Title: Streptococcus suis vaccines and diagnostic tests.

The invention relates to Streptococcus infections of 
pigs, to vaccines directed against those infections, to tests 
for diagnosing Streptococcus infections and to the field of 
bacterial vaccines, more particularly to vaccines directed
against Streptococcus infections.

Streptococcus species, of which a large variety cause
infections in domestic animals and man, are often grouped
according to Lancefield's groups. Typing according to
Lancefield is based on serological determinants or antigens
that are, among others, present in the capsule of the
bacterium, and allows only an approximate determination:
bacteria from different groups often show cross-reactivity
with each other, while other Streptococci cannot be assigned
a group determinant at all. Within groups, further
differentiation is often possible on the basis of serotyping;
these serotypes further contribute to the large antigenic
variability of Streptococci, a fact that creates an array of
difficulties in the diagnosis of, and vaccination against,
Streptococcal infections.

Lancefield group A Streptococcus species (GAS,
Streptococcus pyogenes) are common in children, causing
nasopharyngeal infections and complications thereof. Among
animals, cattle in particular are susceptible to GAS, in which
mastitis is often found.

Group A streptococci are the etiologic agents of
streptococcal pharyngitis and impetigo, two of the commonest
bacterial infections in children, as well as a variety of less
common but potentially life-threatening infections, including
soft tissue infections, bacteraemia, and pneumonia. In
addition, GAS are uniquely associated with the postinfectious
autoimmune syndromes of acute rheumatic fever and
poststreptococcal glomerulonephritis.

Several recent reports suggest that the incidence both of
serious infections due to GAS and of acute rheumatic fever has
[...]

[...] often lack a serologically detectable capsule, a large
majority of strains associated with neonatal infection belong
to one of four major capsular serotypes, Ia, Ib, II or III.
The capsular polysaccharide forms the outermost layer around
the exterior of the bacterial cell, superficial to the cell
wall. The capsule is distinct from the cell wall-associated
group B carbohydrate. It has been suggested that the presence
of sialic acid in the capsule of bacteria that cause
meningitis is important for these bacteria to breach the
blood-brain barrier. Indeed, in S. agalactiae sialic acid has
been shown to be critical for the virulence function of the
type III capsule. The capsule of S. suis serotype 2 is composed
of glucose, galactose, N-acetylglucosamine, rhamnose and sialic
acid.

The group B polysaccharide, in contrast to the
type-specific capsule, is present on all GBS strains and is the
basis for serogrouping of the organisms into Lancefield's
group B. Early studies by Lancefield and co-workers showed
that antibodies raised in rabbits against whole GBS organisms
protected mice against challenge with strains of homologous
capsular type, demonstrating the central role of the capsular
polysaccharide as a protective antigen. Studies in the 1970s
by Baker and Kasper demonstrated that cord blood of human
infants with type III GBS sepsis uniformly had low or
undetectable levels of antibodies directed against the type
III capsule, suggesting that a deficiency of anticapsular
antibody was a key factor in susceptibility of human neonates
to GBS disease.

Lancefield group C infections, such as those with S.
equi, S. zooepidemicus, S. dysgalactiae, and others are mainly
seen in horses, cattle and pigs, but can also cross the
species barrier to humans. Lancefield group D (S. bovis)
infections are found in all mammals and some birds,
sometimes resulting in endocarditis or septicaemia.

Lancefield groups E, G, L, P, U and V (S. porcinus, S.
canis, S. dysgalactiae) are found in various hosts, causing
neonatal infections, nasopharyngeal infections or mastitis.

WO 00/05378 PCT/NL99/00460

Within Lancefield groups R, S, and T (and with ungrouped
types) S. suis is found, an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs.
Incidentally, it can also cause meningitis in man.

Streptococcus suis is an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs (4, 46).
Incidentally, it can also cause meningitis in man (1). S. suis
strains are usually identified and classified by their
morphological, biochemical and serological characteristics (58,
59, 46). Serological classification is based on the presence of
specific antigenic polysaccharides. So far, 35 different
serotypes have been described (9, 56, 14). In several European
countries, S. suis serotype 2 is the most prevalent type
isolated from diseased pigs, followed by serotypes 9 and 1.
Serological typing of S. suis is carried out using different
types of agglutination tests. In these tests, isolated and
biochemically characterised S. suis cells are agglutinated with
a panel of 35 specific sera. These methods are very laborious
and time-consuming.

Little is known about the pathogenesis of the disease caused
by S. suis, let alone about its various serotypes such as type
2. Various bacterial components, such as extracellular and
cell-membrane associated proteins, fimbriae, haemagglutinins,
and haemolysin have been suggested as virulence factors (9, 10,
11, 15, 16, 47, 49). However, the precise role of these protein
components in the pathogenesis of the disease remains unclear
(37). It is well known that the polysaccharidic capsule of
various Streptococci and other gram-positive bacteria plays an
important role in pathogenesis (3, 6, 35, 51, 52). The capsule
enables these micro-organisms to resist phagocytosis and is
therefore regarded as an important virulence factor. Recently,
a role of the capsule of S. suis in the pathogenesis was
suggested as well (5). However, the structure, organisation and
functioning of the genes responsible for capsule polysaccharide
synthesis (cps) in S. suis are unknown. Within S. suis serotypes
1 and 2, strains can differ in virulence for pigs (41, 45, 49).
Some type 1 and 2 strains are virulent, other strains are not.
Because both virulent and non-virulent strains of serotypes 1
and 2 are fully encapsulated, it may even be that the
capsule is not a relevant factor required for virulence.

Attempts to control S. suis infections or disease are 
still hampered by the lack of knowledge about the epidemiology 
of the disease and the lack of effective vaccines and 
sensitive diagnostics. It is well known and generally accepted
that the polysaccharidic capsule of various Streptococci and
other gram-positive bacteria plays an important role in 
pathogenesis. The capsule enables these micro-organisms to 
resist phagocytosis and is therefore regarded as an important 
virulence factor. 

Compared to encapsulated S. suis strains,
non-encapsulated S. suis strains are phagocytosed by murine
polymorphonuclear leucocytes to a greater degree. Moreover, an
increase in thickness of capsule was noted for in vivo grown
virulent strains while no increase was observed for avirulent
strains. Therefore, these data again demonstrate the role of
the capsule in the pathogenesis for S. suis as well.

Ungrouped Streptococcus species, such as S. mutans, causing
caries in humans, S. uberis, causing mastitis in cattle,
and S. pneumoniae, causing major infections in humans, and
Enterococcus faecalis and E. faecium, further contribute to
the large group of Streptococci.

Streptococcus pneumoniae (the pneumococcus) is a human 
pathogen causing invasive diseases, such as pneumonia, 
bacteraemia, and meningitis. Despite the availability of
antibiotics, pneumococcal infections remain common and can
still be fatal, especially in high-risk groups, such as young
children and elderly people. Particularly in developing
countries, many children under the age of five years die each
year from pneumococcal pneumonia. S. pneumoniae is also the
leading cause of otitis media and sinusitis. These infections
are less serious, but nevertheless incur substantial medical
costs, especially when leading to complications, such as
permanent deafness. The normal ecological niche of the
pneumococcus is the nasopharynx of man. The entire human
population is colonised by the pneumococcus at one time or
another, and at a given time, up to 60% of individuals may be
carriers. Nasopharyngeal carriage of pneumococci by man is
often accompanied by the development of protection against
infection by the same serotype. Most infections do not occur
after prolonged carriage but are caused by recently acquired
strains. Many bacteria contain surface
polysaccharides which act as a protective layer against the
environment. Surface polysaccharides of pathogenic bacteria
usually make the bacteria resistant to the defense mechanisms
of the host, e.g., the lytic action of serum or phagocytosis.
In this respect, the serotype-specific capsular polysaccharide
(CP) of Streptococcus pneumoniae is an important virulence
factor. Unencapsulated strains are avirulent, and antibodies
directed against the CP are protective. Protection is serotype
specific; each serotype has its own, specific CP structure.
Ninety different capsular serotypes have been identified.
Currently, CPs of 23 serotypes are included in a vaccine.

Vaccines directed against Streptococcus infections in 
general aim at utilising an immune response directed against 
the polysaccharide capsule of the various Streptococcus
species, especially since the capsule is considered a main
virulence factor for these bacteria. The capsule, during
infection, provides resistance to phagocytosis and thus
promotes the escape of the bacteria from the immune system of
the host, protecting the bacteria from elimination by
macrophages and neutrophils.

The capsule particularly confers on the bacterium resistance
to complement-mediated opsonophagocytosis. In addition, some
bacteria express capsular polysaccharides (CPs) that mimic
host molecules, thereby avoiding the immune system of the
host. Also, even when the bacteria have been phagocytosed,
intracellular killing is hampered by the presence of a
capsule.

It is in general thought that only when the host has
antibodies or other serum factors directed against capsule
antigens will the bacterium be recognised by the immune
system through the anticapsular antibodies or serum factors
bound to its capsule and, through opsonisation, be
phagocytosed and killed.

However, these antibodies are serotype-specific, and will
often confer protection against only one of the many
serotypes known within a group of Streptococci.

For example, current commercially available S. suis
vaccines, which are in general based on whole-cell bacterial
preparations, or on capsule-enriched fractions of S. suis,
confer only limited protection against heterologous strains.
Also, the current pneumococcal vaccine, licensed in the United
States in 1983, consists of purified CPs of 23 pneumococcal
serotypes whereas at least 90 CP types exist.
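
The gap between the 23 CP types in the vaccine and the at least 90
known CP types can be put in rough numbers. The sketch below is only
an illustration using the two figures quoted above; the function name
is hypothetical, and the result counts CP types, not the share of
disease they cause.

```python
# Figures quoted in the text: 23 CP types in the vaccine, at least 90 known.
VACCINE_CP_TYPES = 23
KNOWN_CP_TYPES = 90


def type_coverage(included: int, known: int) -> float:
    """Fraction of known CP types that are represented in a vaccine."""
    if known <= 0 or not 0 <= included <= known:
        raise ValueError("need known > 0 and 0 <= included <= known")
    return included / known


coverage = type_coverage(VACCINE_CP_TYPES, KNOWN_CP_TYPES)
print(f"{coverage:.1%} of known CP types are in the vaccine")
```

Because 90 is a lower bound on the number of CP types, this fraction
is an upper bound on type coverage.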

The composition of this pneumococcal vaccine was based on 
the frequency of the occurrence of disease isolates in the US
and cross-reactivity between various serotypes. Although this
vaccine protects healthy adults against infections caused by
serotypes included in the vaccine, it fails to raise a
protective immune response in infants younger than 18 months
and it is less effective in elderly people. In addition, the
vaccine confers only limited protection in patients with
immunodeficiencies and haematological malignancies.

In the light of the above, improved vaccines are needed against
Streptococcus infections. Much attention is being paid to
producing CP vaccines by producing the relevant polysaccharides
via chemical or recombinant means. However, chemical synthesis
of polysaccharides is costly, and capsular polysaccharide
synthesis by recombinant means necessitates knowledge about the
relevant genes, which is not always available and needs to be
determined for each and every relevant serotype.



The invention provides an isolated or recombinant nucleic
acid encoding a capsular (cps) gene cluster of Streptococcus
suis. Biosynthesis of capsule polysaccharides in general has
been studied in a number of Gram-positive and Gram-negative
bacteria (32). In Gram-negative bacteria, but also in a number
of Gram-positive bacteria, genes which are involved in the
biosynthesis of polysaccharides are clustered at a single
locus. Streptococcus suis capsular genes as provided by the
invention show a common genetic organisation involving three
distinct regions. The central region is serotype specific and
encodes enzymes responsible for the synthesis and
polymerisation of the polysaccharides. This region is flanked
by two regions conserved in Streptococcus suis which encode
proteins for common functions such as transport of the
polysaccharide across the cellular membrane. However,
between species, only low homologies exist, hampering easy
comparison and detection of seemingly similar genes. Knowing
the nucleic acid encoding the flanking regions allows
type-specific determination of nucleic acid of the central
region of Streptococcus suis serotypes, as for example
described in the experimental part of the description of the
invention.
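
The use of conserved flanking regions to reach the type-specific
central region can be sketched in a few lines. This is a toy model
only: the sequences are invented, the function name is hypothetical,
and exact string matching stands in for hybridisation or PCR, which
tolerate mismatches that this code does not.

```python
from typing import Optional


def central_region(locus: str, left_flank: str, right_flank: str) -> Optional[str]:
    """Return the sequence between two conserved flanks, or None if a flank is absent."""
    start = locus.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = locus.find(right_flank, start)
    if end == -1:
        return None
    return locus[start:end]


# Invented toy sequences; they do not correspond to any real cps locus.
LEFT = "ATGGCTAAAC"
RIGHT = "TTGACCGGTA"
toy_loci = {
    "type_x": "CC" + LEFT + "GGGTTTAAACCC" + RIGHT + "AA",
    "type_y": "CC" + LEFT + "ACACACTGTG" + RIGHT + "AA",
}

for name, sequence in toy_loci.items():
    print(name, central_region(sequence, LEFT, RIGHT))
```

Both toy loci share the same flanks, yet the extracted central
segments differ, which mirrors the conserved-flank, variable-centre
organisation described above.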

The invention provides an isolated or recombinant nucleic 
acid encoding a capsular gene cluster of Streptococcus suis or 
a gene or gene fragment derived therefrom. Such a nucleic acid
is for example provided by hybridising chromosomal DNA derived
from any one of the Streptococcus suis serotypes to a nucleic
acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster, as provided by the
invention (see for example Tables 4 and 5) and cloning of
(type-specific) genes as for example described in the
experimental part of the description. At least 14 open reading
frames are identified. Most of the genes belong to a single
transcriptional unit, indicating co-ordinate control of
these genes; they, and the enzymes and proteins they encode,
act in concert to provide the capsule with the relevant
polysaccharides. The invention provides cps genes and proteins
encoded thereby involved in regulation (CpsA), chain length
determination (CpsB, C), export (CpsC) and biosynthesis (CpsE,
F, G, H, J, K). Although the overall organisation seemed at
first glance to be similar to that of the cps and eps gene
clusters of a number of Gram-positive bacteria (19, 32, 42),
overall homologies are low (see Table 3). The region involved
in biosynthesis is located at the centre of the gene cluster
and is flanked by two regions containing genes with more
common functions.

The invention provides an isolated or recombinant nucleic
acid encoding a capsular gene cluster of Streptococcus suis
serotype 2 or a gene or gene fragment derived therefrom,
preferably as identified in Figure 3. Genes in this gene
cluster are involved in polysaccharide biosynthesis of
capsular components and antigens. For a further description of
such genes see for example Table 2 of the description; for
example, a cpsA gene is provided functionally encoding
regulation of capsular polysaccharide synthesis, whereas cpsB
and cpsC are functionally involved in chain length
determination. Other genes, such as cpsD, E, F, G, H, I, J, K
and related genes, are involved in polysaccharide synthesis,
functioning for example as glucosyl- or glycosyltransferase.
The cpsF, G, H, I, J genes encode more type-specific proteins
than the flanking genes, which are found more or less conserved
throughout the species and can serve as a basis for selection of
primers or probes in PCR amplification or cross-hybridisation
experiments for subsequent cloning.
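
The gene roles named in this paragraph can be held in a small lookup
table, as a sketch of how candidate genes for primer selection might
be sorted. The names CPS2_ROLES, TYPE_SPECIFIC and primer_candidates
are hypothetical. Treating every named gene outside cpsF to cpsJ as
conserved is an assumption made for illustration; Table 2 of the
description remains the authority.

```python
from typing import Dict, List, Set

# Roles as stated in the paragraph above.
CPS2_ROLES: Dict[str, str] = {
    "cpsA": "regulation",
    "cpsB": "chain length determination",
    "cpsC": "chain length determination",
}
CPS2_ROLES.update({f"cps{g}": "polysaccharide synthesis" for g in "DEFGHIJK"})

# Genes the text calls "more type-specific" than the flanking genes.
TYPE_SPECIFIC: Set[str] = {f"cps{g}" for g in "FGHIJ"}


def primer_candidates(conserved: bool) -> List[str]:
    """List genes to consider for conserved (True) or type-specific (False) primers.

    Assumption: any named gene not in TYPE_SPECIFIC is treated as conserved.
    """
    return sorted(g for g in CPS2_ROLES if (g not in TYPE_SPECIFIC) == conserved)


print("conserved candidates:", primer_candidates(True))
print("type-specific candidates:", primer_candidates(False))
```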

For example, the invention further provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 1 or a gene or gene fragment
derived therefrom, preferably as identified in Figure 4.

In addition, the invention provides an isolated or
recombinant nucleic acid encoding a capsular gene cluster of
Streptococcus suis serotype 9 or a gene or gene fragment
derived therefrom, preferably as identified in Figure 5.
Furthermore, the invention provides for example a fragment or
parts thereof of the cps locus, involved in the capsular
polysaccharide biosynthesis, of S. suis, exemplified in the
experimental part for serotype 1, 2 or 9, and allows easy
identification or detection of related fragments derived from
other serotypes of S. suis.

The invention provides a nucleic acid probe or primer 
derived from a nucleic acid according to the invention 
allowing species- or serotype-specific detection of
Streptococcus suis. Such a probe or primer (herein used
interchangeably) is for example a DNA, RNA or PNA (peptide
nucleic acid) probe hybridising with capsular nucleic acid as
provided by the invention. Species-specific detection is
provided preferably by selecting a probe or primer sequence
from a species-specific region (e.g. flanking region) whereas
serotype-specific detection is provided preferably by
selecting a probe or primer sequence from a type-specific
region (e.g. central region) of a capsular gene cluster as
provided by the invention. Such a probe or primer can be used
in a further unmodified form, for example in
cross-hybridisation or polymerase chain reaction (PCR)
experiments as for example described in the experimental part
of the description of the invention. Herein the invention
provides the isolation and molecular characterisation of
additional type-specific cps genes of S. suis types 1 and 9. In
addition, we describe the genetic diversity of the cps loci of
serotypes 1, 2 and 9 among the 35 S. suis serotypes yet known.
Type-specific probes are identified. Also, a type-specific PCR
for, for example, serotype 9 is provided, being a rapid,
reliable and sensitive assay, which is used directly on nasal or
tonsillar swabs or other samples of infected or carrier
animals.
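
How results from a species-specific (flanking region) primer pair and
several type-specific (central region) primer pairs might be combined
can be sketched as follows. The function interpret_pcr, its wording
and its decision rules are hypothetical illustrations, not an assay
protocol from the description.

```python
from typing import Dict, List


def interpret_pcr(species_band: bool, type_bands: Dict[str, bool]) -> str:
    """Combine one species-level result with per-serotype results for a sample."""
    positives: List[str] = sorted(t for t, seen in type_bands.items() if seen)
    if not species_band:
        if positives:
            return "inconclusive: type-specific band without species band"
        return "S. suis not detected"
    if not positives:
        return "S. suis detected, serotype not among those tested"
    if len(positives) > 1:
        return "S. suis detected, multiple type bands: " + ", ".join(positives)
    return f"S. suis serotype {positives[0]}"


# Example: a swab positive for the flanking-region primers and the type 9 primers.
print(interpret_pcr(True, {"1": False, "2": False, "9": True}))
```

Keeping the species-level and type-level results separate lets a
sample from one of the serotypes without a type-specific primer pair
still be reported as S. suis.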

The invention also provides a probe or primer according to 
the invention further provided with at least one reporter 
molecule. Examples of reporter molecules are manifold and
known in the art; for example, a reporter molecule can comprise
additional nucleic acid provided with a specific sequence
(e.g. oligo-dT) hybridising to a corresponding sequence, to
which hybridisation can easily be detected, for example because
it has been immobilised to a solid support.
Yet other reporter molecules comprise chromophores, e.g.
fluorochromes for visual detection, for example by light
microscopy or fluorescent in situ hybridisation (FISH)
techniques, or comprise an enzyme such as horseradish
peroxidase for enzymatic detection, e.g. in enzyme-linked
assays (EIA). Yet other reporter molecules comprise
radioactive compounds for detection in radiation-based assays.

In a preferred embodiment of the invention, at least one 
probe or primer according to the invention is provided 
(labelled) with a reporter molecule and a quencher molecule,
providing together with unlabelled probe or primer a PCR-based
test allowing rapid detection of specific hybridisation.

The invention further provides a diagnostic test or test kit 
comprising a probe or primer as provided by the invention. 
Such a test or test kit, for example a cross-hybridisation
test or PCR-based test, is advantageously used in rapid
detection and/or serotyping of Streptococcus suis.

The invention furthermore provides a protein or fragment
thereof encoded by a nucleic acid according to the invention.
Examples of such a protein or fragment are proteins described
in for example Table 2 of the description; for example, a CpsA
protein is provided that functions in regulation of capsular
polysaccharide synthesis, whereas CpsB and CpsC are
functionally involved in chain length determination. Other
proteins or functional fragments thereof as provided by the
invention, such as CpsD, E, F, G, H, I, J, K and related
proteins, are involved in polysaccharide biosynthesis,
functioning for example as glucosyl- or glycosyltransferase
in polysaccharide biosynthesis of Streptococcus suis capsular
antigen.

The invention furthermore provides a method to produce a
Streptococcus suis capsular antigen comprising using a protein
or functional fragment thereof as provided by the invention,
and provides therewith a Streptococcus suis capsular antigen
obtainable by such a method. A comparison of the predicted
amino acid sequences of the cps2 genes with sequences found in
the databases allowed the assignment of functions to the open
reading frames. The central region contains the type-specific
glycosyltransferases and the putative polysaccharide
polymerase. This region is flanked by two regions encoding
proteins with common functions, such as regulation and
transport of polysaccharide across the membrane.

Biosynthesis of Streptococcus capsular polysaccharide antigen 
using a protein or functional fragment thereof is 
advantageously used in chemo-enzymatic synthesis and the 
development of vaccines which offer protection against
serotype-specific Streptococcal disease, and is also
advantageously used in the synthesis and development of
multivalent vaccines against Streptococcal infections. Such
vaccines elicit anticapsular antibodies which confer 
protection. 

Furthermore, the invention provides an acapsular
Streptococcus mutant for use in a vaccine, a vaccine strain
derived therefrom and a vaccine derived therefrom. Surprisingly,
and against the grain of common doctrine, the invention
provides use of a Streptococcus mutant deficient in capsular
expression in a vaccine.

Acapsular Streptococcus mutants have long been known in 
the art and can be found in nature. Griffith (J. Hyg.
27:113-159, 1928) demonstrated that pneumococci could be
transformed from one type to another. If he injected live rough
(acapsular or unencapsulated) type 2 pneumococci into mice, the
mice would survive. If, however, he injected the same dose of
live rough type 2 mixed with heat-killed smooth (encapsulated)
type 1 into a mouse, the mouse would die, and from the blood he
could isolate live smooth type 1 pneumococci. At that time,
the significance of this transforming principle was not
understood. However, understanding came when it was shown that
DNA constituted the genetic material responsible for
phenotypic changes during transformation.

Streptococcus mutants deficient in capsular expression 
are found in several forms. Some are fully deficient and have
no capsule at all; others form a deficient capsule,
characterised by a mutation in a capsular gene cluster.
Deficiency can for instance include capsular formation wherein
the organisation of the capsular material has been
rearranged, as for example demonstrable by electron microscopy.
Yet others have a nearly fully developed capsule which is only
deficient in a particular sugar component.

Now, after much advance of biotechnology and despite the 
fact that little is still known about the exact localisation 
and sequence of genes involved in capsular synthesis in
Streptococci, it is possible to create mutants of
Streptococci, for example by homologous recombination or
transposon mutagenesis, which has for example been done for
GAS (Wessels et al., PNAS 88:8317-8321, 1991), for GBS (Wessels
et al., PNAS 86:8983-8987, 1989), for S. suis (Smith, ID-DLO
Annual report 1996, page 18-19; Charland et al., Microbiol.
144:325-332, 1998) and for S. pneumoniae (Kolkman et al., J.
Bact. 178:3736-3741, 1996). Such recombinant-derived mutants,
or isogenic mutants, can easily be compared with the wild-type
strains from which they have been derived.

In a preferred embodiment, the invention provides use of
a recombinant-derived Streptococcus mutant deficient in
capsular expression in a vaccine. Recombinant techniques
useful in producing such mutants are for example homologous
recombination, transposon mutagenesis, and others, whereby
deletions, insertions or (point) mutations are introduced in
the genome. Advantages of using recombinant techniques are the
stability of the obtained mutants (especially with homologous
recombination and double cross-over techniques), and the
knowledge about the exact site of the deletion, mutation or
insertion.

In a much preferred embodiment, the invention provides a
stable mutant deficient in capsular expression obtainable for
example through homologous recombination or cross-over
integration events. Examples of such a mutant can be found in
the experimental part of this description, for example mutant
10cpsB or 10cpsEF is such a stable mutant as provided by the
invention. 

The invention also provides a Streptococcus vaccine 
strain and vaccine that has been derived from a Streptococcus 
mutant deficient in capsular expression. In general, said
strain or vaccine is applicable within the whole range of
Streptococcal infections, be it those in animals or man
or zoonotic infections. It is of course now possible to
first select a common vaccine strain and derive a
Streptococcus mutant deficient in capsular expression therefrom
for the selection of a vaccine strain and use in a vaccine
according to the invention.

In a preferred embodiment, the invention provides use 
of a Streptococcus mutant deficient in capsular expression in 
a vaccine wherein said Streptococcus mutant is selected from
the group composed of Streptococcus group A, Streptococcus
group B, Streptococcus suis and Streptococcus pneumoniae.
Herewith the invention provides vaccine strains and vaccines
for use with these notoriously heterologous Streptococci, of
which a multitude of serotypes exist. With a vaccine as
provided by the invention that is derived from a specific
Streptococcus mutant that is deficient in capsular expression,
the difficulties relating to lack of heterologous protection
can be circumvented since these mutants do not rely on
capsular antigens per se to induce protection.

In a preferred embodiment, said vaccine strain is
selected for its ability to survive or even replicate in an
immune-competent host or host cells and thus can persist for a 
certain period, varying from 1-2 days to more than one or two 
weeks, in a host, despite its deficient character. 

Although an immunodeficient host will support replication
of a wide range of bacteria that are deficient in one or more
virulence factors, in general it is considered a
characteristic of pathogenicity of Streptococci that they can
survive for certain periods or replicate in a normal host or
host cells such as macrophages. For example, Williams and
Blakemore (Neuropath. Appl. Neurobiol. 16:345-356, 1990;
Neuropath. Appl. Neurobiol. 16:377-392, 1990; J. Infect.
Dis. 162:474-481, 1990) show that, although both
polymorphonuclear cells and macrophage cells are capable of
phagocytosing pathogenic S. suis in pigs lacking anti-S. suis
antibodies, only pathogenic bacteria could survive and multiply
inside macrophages and the pig.

In a preferred embodiment, the invention, however, 
provides a deficient or avirulent mutant or vaccine strain
which is capable of surviving at least 4-5 days, preferably at
least 8-10 days in said host, thereby allowing the development
of a solid immune response to subsequent Streptococcus
infection.

Due to its persistent but avirulent character, a 
Streptococcus mutant or vaccine strain as provided by the
invention is well suited to generate specific and/or
long-lasting immune responses against Streptococcal antigens,
all the more so because possible specific immune responses of
the host directed against a capsule are relatively irrelevant,
since a vaccine strain as provided by the invention is in
general not recognised by such antibodies.

In addition, the invention provides a Streptococcus 
vaccine strain according to the invention which strain comprises
a mutant capable of expressing a Streptococcus virulence 
factor or antigenic determinant. 

In a preferred embodiment, the invention provides a
Streptococcus vaccine strain according to the invention which
strain comprises a mutant capable of expressing a
Streptococcus virulence factor wherein said virulence factor
or antigenic determinant is selected from a group of cellular
components, such as muramidase-released protein (MRP),
extracellular factor (EF) and cell-membrane associated
proteins, 60 kDa heat shock protein, pneumococcal surface
protein A (PspA), pneumolysin, C protein, protein M,
fimbriae, haemagglutinins and haemolysin or components
functionally related thereto.
In a much preferred embodiment, the invention provides a
Streptococcus vaccine strain according to the invention which
strain comprises a mutant capable of over-expressing said
virulence factor. In this way, the invention provides a
vaccine strain for incorporation in a vaccine which
specifically causes a host to mount an immune response
directed against antigenically important determinants of
virulence (listed above), thereby providing specific
protection directed against said determinants. Over-expression
can for example be achieved by cloning the gene involved
behind a strong promoter, which is for example
constitutively expressed in a multicopy system, either in a
plasmid or via integration in a genome.

In yet another embodiment, the invention provides a 
Streptococcus vaccine strain according to the invention which
comprises a mutant capable of expressing a non-Streptococcus
protein. Such a vector-Streptococcus vaccine strain allows,
when used in a vaccine, protection against pathogens other
than Streptococcus.

Due to its persistent but avirulent character, a
Streptococcus vaccine strain or mutant as provided by the
invention is well suited to generate specific and long-lasting
immune responses, not only against Streptococcal antigens, but
also against other antigens when these are expressed by said
strain. Especially antigens derived from another pathogen are
now expressed without the detrimental effects of said antigen
or pathogen which would otherwise have harmed the host.

An example of such a vector is a Streptococcus vaccine
strain or mutant wherein said antigen is derived from a
pathogen, such as Actinobacillus pleuropneumoniae,
Mycoplasmatae, Bordetella, Pasteurella, E. coli, Salmonella,
Campylobacter, Serpulina and others.

The invention also provides a vaccine comprising a
Streptococcus vaccine strain or mutant according to the
invention and further comprising a pharmaceutically acceptable
carrier or adjuvant. Carriers or adjuvants are well known in
the art; examples are phosphate buffered saline, physiological
salt solutions, (double-) oil-in-water-emulsions,
aluminum hydroxide, Specol, block- or co-polymers, and others.

A vaccine according to the invention can comprise a
vaccine strain either in a killed or live form. For example, a
killed vaccine comprising a strain having (over)expressed a
Streptococcal or heterologous antigen or virulence factor is
very well suited for eliciting an immune response. In a
preferred embodiment, the invention provides a vaccine wherein
said strain is live; due to its persistent but avirulent
character, a Streptococcus vaccine strain as provided by the
invention is well suited to generate specific and long-lasting
immune responses.

Now that a Streptococcal vaccine is provided by the
invention, the invention also provides a method for
controlling or eradicating a Streptococcal disease in a
population comprising vaccinating subjects in said population
with a vaccine according to the invention.

In a preferred embodiment, a method for controlling or
eradicating a Streptococcal disease is provided comprising
testing a sample, such as a blood sample, or nasal or throat
swab, faeces, urine, or other samples such as can be sampled
at or after slaughter, collected from at least one subject,
such as an infant or a pig, in a population partly or wholly
vaccinated with a vaccine according to the invention for the
presence of encapsulated Streptococcal strains or mutants.

Since a vaccine strain or mutant according to the invention is
not pathogenic, and can be distinguished from wild-type
strains by capsular expression, the detection of (fully)
encapsulated Streptococcal strains indicates that wild-type
infections are still present. Such wild-type infected subjects
can then be isolated from the remainder of the population
until the infection has passed. With domestic animals,
such as pigs, it is even possible to remove the infected
subject from the population as a whole by culling. Detection
of wild-type strains can be achieved via traditional culturing
techniques, or by rapid detection techniques such as PCR
detection.

In yet another embodiment, the invention provides a
method for controlling or eradicating a Streptococcal disease
comprising testing a sample collected from at least one
subject in a population partly or wholly vaccinated with a
vaccine according to the invention for the presence of
capsule-specific antibodies directed against Streptococcal
strains. Capsule-specific antibodies can be detected with
classical techniques known in the art, such as used for
Lancefield's group typing or serotyping.

A much preferred embodiment of a method provided by the
invention for controlling or eradicating a Streptococcal
disease in a population comprises vaccinating subjects in said
population with a vaccine according to the invention and
testing a sample collected from at least one subject in said
population for the presence of encapsulated Streptococcal
strains and/or for the presence of capsule-specific antibodies
directed against Streptococcal strains.

For example, a method is provided according to the
invention wherein said Streptococcal disease is caused by
Streptococcus suis.

The invention also provides a diagnostic assay for testing a
sample for use in a method according to the invention
comprising at least one means for the detection of
encapsulated Streptococcal strains and/or for the detection of
capsule-specific antibodies directed against Streptococcal
strains.

The invention furthermore provides a vaccine comprising an
antigen according to the invention and further comprising a
suitable carrier or adjuvant. The immunogenicity of a capsular
antigen provided by the invention is for example increased by
linking to a carrier (such as a carrier protein), allowing the
recruitment of T-cell help in developing an immune response.

The invention further provides a recombinant micro-organism
provided with at least a part of a capsular gene
cluster derived from Streptococcus suis. The invention
provides for example a lactic acid bacterium provided with at
least a part of a capsular gene cluster derived from
Streptococcus suis. Various food-grade lactic acid bacteria
(Lactococcus lactis, Lactobacillus casei, Lactobacillus
plantarum and Streptococcus gordonii) have been used as
delivery systems for mucosal immunization. It has now been
shown that oral (or mucosal) administration of recombinant L.
lactis, Lactobacillus, and Streptococcus gordonii can elicit
local IgA and/or IgG antibody responses to an expressed
antigen. The use of oral routes for immunization against
infective diseases is desirable because oral vaccines are
easier to administer, have higher compliance rates, and
because mucosal surfaces are the portals of entry for many
pathogenic microbial agents. It is within the skill of the
artisan to provide such micro-organisms with (additional)
genes.

The invention further provides a recombinant
Streptococcus suis mutant provided with a modified capsular
gene cluster. It is within the skill of the artisan to swap
genes within a species. In a preferred embodiment, an
avirulent Streptococcus suis mutant is selected to be provided
with at least a part of a modified capsular gene cluster
according to the invention.

The invention further provides a vaccine comprising a
micro-organism or a mutant provided by the invention. An advantage
of such a vaccine over currently used vaccines is that it
comprises accurately defined micro-organisms and
well-characterised antigens, allowing accurate determination of
immune responses against various antigens of choice.

The invention is further explained in the experimental part
of this description without limiting the invention thereto.

Experimental part
MATERIAL AND METHODS 

Bacterial strains and growth conditions. 

The bacterial strains and plasmids used in this study are
listed in Table 1. S. suis strains were grown in Todd-Hewitt
broth (code CM189, Oxoid), and plated on Columbia agar blood
base (code CM331, Oxoid) containing 6% (v/v) horse blood.
E. coli strains were grown in Luria broth (28) and plated on
Luria broth containing 1.5% (w/v) agar. If required,
antibiotics were added to the plates at the following
concentrations: spectinomycin, 100 µg/ml for S. suis and 50
µg/ml for E. coli; ampicillin, 50 µg/ml.
Serotyping. The S. suis strains were serotyped by the slide
agglutination test with serotype-specific antibodies (44).
DNA techniques. Routine DNA manipulations were performed as 
described by Sambrook et al. (36). 

Alkaline phosphatase activity. To screen for PhoA fusions in
E. coli, plasmid libraries were constructed. To this end,
chromosomal DNA of S. suis type 2 was digested with AluI. The
300-500-bp fragments were ligated to SmaI-digested pPHOS2.
Ligation mixtures were transformed to the PhoA- E. coli strain
CC118. Transformants were plated on LB media supplemented with
5-bromo-4-chloro-3-indolylphosphate (BCIP, 50 µg/ml, Boehringer,
Mannheim, Germany). Blue colonies were purified on fresh
LB/BCIP plates to verify the blue phenotype.

DNA sequence analysis. DNA sequences were determined on a 373A
DNA Sequencing System (Applied Biosystems, Warrington, GB).
Samples were prepared by use of an ABI PRISM dye terminator
cycle sequencing ready reaction kit (Applied Biosystems).
Sequencing data were assembled and analyzed using the
MacMollyTetra program. Custom-made sequencing primers were
purchased from Life Technologies. Hydrophobic stretches within
proteins were predicted by the method of Klein et al. (17). The
BLAST program available on Netscape Navigator™ was used to
search for protein sequences related to the deduced amino acid
sequences.

Construction of gene-specific knock-out mutants of S. suis. To
construct the mutant strains 10cpsB and 10cpsEF we
electrotransformed the pathogenic serotype 2 strain 10
(45, 49) of S. suis with pCPS11 and pCPS28, respectively. In
these plasmids the cpsB and cpsEF genes were disrupted by the
insertion of a spectinomycin-resistance gene. To create pCPS11
the internal 400-bp PstI-BamHI fragment of the cpsB gene in
pCPS7 was replaced by the SpcR gene. For this purpose pCPS7 was
digested with PstI and BamHI and ligated to the 1,200-bp
PstI-BamHI fragment, containing the SpcR gene, from pIC-spc. To
construct pCPS28 we used pIC20R. In this plasmid we
inserted the KpnI-SalI fragment from pCPS17 (resulting in
pCPS25) and the XbaI-ClaI fragment from pCPS20 (resulting in
pCPS27). pCPS27 was digested with PstI and XhoI and ligated to
the 1,200-bp PstI-XhoI fragment, containing the SpcR gene of
pIC-spc. The electrotransformation to S. suis was carried out
as described before (38).

Southern blotting and hybridization. Chromosomal DNA was
isolated as described by Sambrook et al. (36). DNA fragments
were separated on 0.8% agarose gels and transferred to
Zeta-Probe GT membranes (Bio-Rad) as described by Sambrook et al.
(36). DNA probes were labelled with [α-32P]dCTP (3000
Ci/mmol; Amersham) by use of a random primed labelling kit
(Boehringer). The DNA on the blots was hybridized at 65°C with
appropriate DNA probes as recommended by the supplier of the
Zeta-Probe membranes. After hybridization, the membranes were
washed twice with a solution of 40 mM sodium phosphate, pH 7.2,
1 mM EDTA, 5% SDS for 30 min at 65°C and twice with a solution
of 40 mM sodium phosphate, pH 7.2, 1 mM EDTA, 1% SDS for 30 min
at 65°C.

PCR. The primers used in the cps2J PCR correspond to the
positions 13791-13813 and 14465-14443 in the S. suis cps2
locus. The sequences were: 5'-CAAACGCAAGGAATTACGGTATC-3' and
5'-GAGTATCTAAAGAATGCCTATTG-3'. The primers used for the cps1I
PCR correspond to the positions 4398-4417 and 4839-4821 in the
S. suis cps1 sequence. The sequences were:
5'-GGCGGTCTAGCAGATGCTCG-3' and 5'-GCGAACTGTTAGCAATGAC-3'. The
primers used in the cps9H PCR correspond to the positions
4406-4426 and 4494-4475 in the S. suis cps9 sequence. The
sequences were: 5'-GGCTACATATAATGGAAGCCC-3' and
5'-CGGAAGTATCTGGGCTACTG-3'.
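
The primer coordinates quoted above can be checked for internal consistency, and the expected product sizes derived from them. The following is only a minimal sketch using the cps2J and cps1I primer pairs as given in the text; the class and function names are illustrative assumptions and are not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    start: int      # locus position of the primer's 5' end (as quoted)
    end: int        # locus position of the primer's 3' end (as quoted)
    sequence: str   # written 5' -> 3'

    def span(self) -> int:
        """Number of template positions covered by the primer."""
        return abs(self.end - self.start) + 1

    def is_consistent(self) -> bool:
        """True when the sequence length matches the quoted coordinates."""
        return len(self.sequence) == self.span()


def amplicon_length(forward: Primer, reverse: Primer) -> int:
    """Expected product size, from the 5' end of one primer to the other."""
    return abs(reverse.start - forward.start) + 1


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Coordinates and sequences copied from the text above.
CPS2J = (Primer(13791, 13813, "CAAACGCAAGGAATTACGGTATC"),
         Primer(14465, 14443, "GAGTATCTAAAGAATGCCTATTG"))
CPS1I = (Primer(4398, 4417, "GGCGGTCTAGCAGATGCTCG"),
         Primer(4839, 4821, "GCGAACTGTTAGCAATGAC"))

for name, (fwd, rev) in {"cps2J": CPS2J, "cps1I": CPS1I}.items():
    print(name, fwd.is_consistent(), rev.is_consistent(),
          amplicon_length(fwd, rev))
```

The reverse primer is listed with descending coordinates because it anneals to the opposite strand; `reverse_complement` gives the corresponding sequence on the strand to which the coordinates refer.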

Phagocytosis assay. Phagocytosis assays were performed as
described by Leij et al. (23). Briefly, to opsonize the cells,
10^7 S. suis cells were incubated with 6% SPF-pig serum for 30
min at 37°C in a head-over-head rotor at 6 rpm. 10^ AM and 10^7
opsonized S. suis cells were combined and incubated at 37°C
under continuous rotation at 6 rpm. At 0, 30, 60 and 90 min,
1-ml samples were collected and mixed with 4 ml of ice-cold EMEM
to stop phagocytosis. Phagocytes were removed by centrifugation
for 4 min at 110 x g and 4°C. The number of colony forming
units (CFU) in the supernatants was determined. Control
histologically as described before (45, 49). Colonization of
the serosae was scored positively when S. suis was isolated
from the pericardium, thoracal pleura or the peritoneum.
Colonization of the joints was scored positively when S. suis
was isolated from one or more joints (12 joints per animal were
scored).

Vaccination and challenge 

One-week-old pigs were vaccinated intravenously with a dosage
of 10^6 cfu of the S. suis strains 10cpsEF or 10cpsB. Three
weeks later the pigs were challenged intravenously with the
pathogenic serotype 2 strain 10 (10^7 cfu). Disease monitoring,
haematological, serological and bacteriological examinations as
well as post-mortem examinations were as described before under
experimental infections.

Electron microscopy. Bacteria were prepared for electron
microscopy as described by Wagenaar et al. (50). Briefly,
bacteria were mixed with agarose MP (Boehringer) at 37°C to a
concentration of 0.7%. The mixture was immediately cooled on
ice. Upon gelling, samples were cut into 1 to 1.5 mm slices
and incubated in a fixative containing 0.8% glutaraldehyde and
0.8% osmium tetroxide. Subsequently, the samples were fixed
and stained with uranyl acetate by microwave stimulation,
dehydrated and embedded in Epon-Araldite resin. Ultra-thin
sections were counterstained with lead citrate and examined
with a Philips CM 10 electron microscope at 80 kV.

Isolation of porcine alveolar macrophages (AM). Porcine AM were
obtained from the lungs of specific pathogen free (SPF) pigs.
Lung lavage samples were collected as described by van Leengoed
et al. (43). Cells were suspended in EMEM containing 6% (v/v)
SPF-pig serum and adjusted to 10^7 cells per ml.

Identification of the cps locus.

The cps locus of S. suis type 2 was identified by making use of
a strategy developed for the genetic identification of exported
proteins (13, 31). In this system we made use of a plasmid
(pPHOS2) containing a truncated alkaline phosphatase gene (13).
The gene lacked the promoter sequence, the translational start
site and the signal sequence. The truncated gene is preceded by
a unique SmaI restriction site. Chromosomal DNA of S. suis type
2, digested with AluI, was randomly cloned in this restriction
site. Because translocation of PhoA across the cytoplasmic
membrane of E. coli is required for enzymatic activity, the
system can be used to select for S. suis fragments containing a
promoter sequence, a translational start site and a functional
signal sequence. Among 560 individual E. coli clones tested, 16
displayed a dark blue phenotype when plated on media containing
BCIP. DNA sequence analysis of the inserts from several of
these plasmids was performed (results not shown) and the
deduced amino acid sequences were analyzed. The hydrophobicity
profile of one of the clones (pPHOS7, results not shown) showed
that the N-terminal part of the sequence resembled the
characteristics of a typical signal peptide: a short
hydrophilic N-terminal region is followed by a hydrophobic
region of 38 amino acids. These data indicate that the phoA
system was successfully used for the selection of S. suis
genes encoding exported proteins. Moreover, the sequences were
compared with sequences present in the databases. The
sequence of pPHOS7 showed a high similarity (37% identity) with
the protein encoded by the cps14C gene of Streptococcus
pneumoniae (19). This strongly suggests that pPHOS7 contains a
part of the cps operon of S. suis type 2.
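
The kind of hydrophobicity profile used to recognise a signal-peptide-like N-terminus can be illustrated with a sliding-window average. This is only a sketch: it uses the Kyte-Doolittle hydropathy scale in place of the method of Klein et al. (17) named in the Methods, the window size and cutoff are arbitrary assumptions, and the example sequence is hypothetical rather than the pPHOS7 insert.

```python
# Kyte-Doolittle hydropathy values (an assumption of this sketch; the text
# itself used the method of Klein et al.).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_windows(protein: str, window: int = 9,
                        cutoff: float = 1.6) -> list[int]:
    """0-based start indices of windows whose mean exceeds the cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i for i, score in enumerate(profile) if score > cutoff]


# Hypothetical N-terminus: a short charged stretch, a hydrophobic run,
# then a polar tail.
example = "MKKRS" + "LLIVALLVFAIGLLA" + "DEKNSTQ"
print(hydrophobic_windows(example))
```

A run of consecutive window indices above the cutoff directly after a short hydrophilic start is the pattern described for the pPHOS7 clone.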
Cloning of the flanking cps genes. In order to clone the
flanking cps genes of S. suis type 2 the insert of pPHOS7 was
used as a probe to identify chromosomal DNA fragments which
contain flanking cps genes. A 6-kb HindIII fragment was
identified and cloned in pKUN19. This yielded clone pCPS6 (Fig.
1C). Sequence analysis of the insert of pCPS6 revealed that
pCPS6 most probably contained the 5'-end of the cps locus, but
still lacked the 3'-end. Therefore, sequences of the 3'-end of
pCPS6 were in turn used as a probe to identify chromosomal
fragments containing cps sequences located further downstream.
These fragments were also cloned in pKUN19, resulting in
pCPS17. Using the same system of chromosomal walking we
subsequently generated the plasmids pCPS18, pCPS20, pCPS23 and
pCPS26, containing downstream cps sequences.

Analysis of the cps operon. The complete nucleotide sequence of
the cloned fragments was determined (figure 4). Examination of
the compiled sequence revealed the presence of at least 13
potential open reading frames (Orfs), which were designated as
Orf2Y, Orf2X and Cps2A-Cps2K (Fig. 1A). Moreover, a 14th,
incomplete, Orf (Orf2Z) was located at the 5'-end of the
sequence. Two potential promoter sequences were identified. One
was located 313 bp (locations 1885-1865 and 1884-1889)
upstream of Orf2X. The other potential promoter sequence was
located 68 bp upstream of Orf2Y (locations 2241-2236 and
2216-2211). Orf2Y is expressed in opposite orientation. Between Orfs
2Y and 2Z the sequence contained a potential stem-loop
structure, which could act as a transcription terminator. Each
Orf is preceded by a ribosome-binding site and the majority of
the Orfs are very closely linked. The only significant
intergenic gap was found between Cps2G and Cps2H (389
nucleotides). However, no obvious promoter sequences or
potential stem-loop structures were found in this region. These
data suggest that Orf2X and Cps2A-Cps2K are arranged as an
operon.
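
How open reading frames and the gaps between them can be read off a compiled sequence is sketched below. This is an illustrative assumption, not the analysis software used here: it scans only the given strand for ATG-to-stop frames, so an Orf in opposite orientation (such as Orf2Y) would require scanning the reverse complement as well, and it does not look for ribosome-binding sites.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 30) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for ATG...stop ORFs on the given strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the codons from the ATG up to, not including, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None                   # look for the next ATG
    return sorted(orfs, key=lambda orf: orf[1])


def intergenic_gaps(orfs: list[tuple[int, int, int]]) -> list[int]:
    """Nucleotides between the end of each ORF and the start of the next."""
    return [nxt[1] - cur[2] for cur, nxt in zip(orfs, orfs[1:])]


# Toy sequence: two short ORFs in different frames, separated by 2 nt.
toy = "ATG" + "GCT" * 4 + "TAA" + "CC" + "ATG" + "AAA" * 3 + "TGA"
orfs = find_orfs(toy, min_codons=4)
print(orfs, intergenic_gaps(orfs))
```

Small or negative gap values correspond to closely linked or overlapping Orfs, whereas a single large value would flag a region like the one between Cps2G and Cps2H.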

An overview of all Orfs with their properties is shown in
Table 2. The majority of the predicted gene products is related
to proteins involved in polysaccharide biosynthesis. Orf2Z
showed some similarity with the YitS protein of Bacillus
subtilis. YitS was identified during the sequence analysis of
the complete genome of B. subtilis. The function of the protein
is unknown.

Orf2Y showed similarity with the YcxD protein of B. subtilis
(53). Based on the similarity between YcxD and MocR of
Rhizobium meliloti (33), YcxD was suggested to be a regulatory
protein.

Orf2X showed similarity with the hypothetical YAAA proteins 
of Haemophilus influenzae and E. coli. The function of these 
proteins is unknown. 

The gene products encoded by the cps2A, cps2B, cps2C and
cps2D genes showed approximate similarity with the CpsA, CpsC,
CpsD and CpsB proteins of several serotypes of Streptococcus
pneumoniae (19), respectively. This suggests similar functions
for these proteins. Hence, Cps2A may have a role in the
regulation of the capsular polysaccharide synthesis. Cps2B and
Cps2C could be involved in the chain length determination of
the type 2 capsule and Cps2C can play an additional role in the
export of the polysaccharide. The Cps2D protein of S. suis is
related to the CpsB protein of S. pneumoniae and to proteins
encoded by genes of several other Gram-positive bacteria
involved in polysaccharide or exopolysaccharide synthesis, but
their function is unknown (19).

The protein encoded by the cps2E gene showed similarity to
several bacterial proteins with glycosyltransferase
activities: Cps14E and Cps19fE of S. pneumoniae serotypes 14
and 19F (18, 19, 29), CpsE of Streptococcus salivarius (X94980)
and CpsD of Streptococcus agalactiae (34). Recently, Kolkman et
al. (18) showed that Cps14E is a glucosyl-1-phosphate
transferase that links glucose to a lipid carrier, the first
step in the biosynthesis of the S. pneumoniae type 14 repeating
unit. Based on these data a similar function may be fulfilled
by Cps2E of S. suis.

The protein encoded by the cps2F gene showed similarity to
the protein encoded by the rfbU gene of Salmonella enterica
(25). This similarity is most pronounced in the C-terminal
regions of these proteins. The rfbU gene was shown to encode
mannosyltransferase activity (25).

The cps2G gene encoded a protein that showed moderate
similarity with the rfbF gene product of Campylobacter hyoilei
(22), the epsF gene product of S. thermophilus (40) and the
capM gene product of S. aureus (24). On the basis of
similarity the rfbF, epsF and capM genes are suggested to
encode galactosyltransferase activities. Hence, a similar
glycosyltransferase activity could be fulfilled by the cps2G
gene product.

The cps2H gene encodes a protein that is similar to the
N-terminal region of the lgtD gene product of Haemophilus
influenzae (U32768). Moreover, the hydrophobicity plots of
Cps2H and LgtD looked very similar in these regions (data not
shown). Based on sequence similarity the lgtD gene product was
suggested to have glycosyltransferase activity (U32768).

The gene product encoded by the cps2I gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668). This protein is part of the
gene cluster responsible for the serotype-b-specific antigen of
A. actinomycetemcomitans. The function of the protein is unknown.

The gene products encoded by the cps2J and cps2K genes
showed significant similarities to the Cps14J protein of S.
pneumoniae. The cps14J gene of S. pneumoniae was shown to
encode a β-1,4-galactosyltransferase activity. In S.
pneumoniae Cps14J is responsible for the addition of the fourth
(i.e. last) sugar in the synthesis of the S. pneumoniae
serotype 14 polysaccharide (20). Some similarity was even
found between Cps2J and Cps2K (Fig. 2, 25.5% similarity). This
similarity was most pronounced in the N-terminal regions of the
proteins. Recently, two small conserved regions were identified
in the N-terminus of Cps14J and Cps14I and their homologues
(20). These regions were predicted to be important for
catalytic activity. Both regions, DXS and DXDD (Fig. 2), were
also found in Cps2J and Cps2K.
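
Searching a protein sequence for the two conserved regions, DXS and DXDD (X being any residue), can be sketched with regular expressions. The function name, the length taken as "N-terminal" and the example sequence are assumptions made for illustration only; the sequence is hypothetical and is not Cps2J or Cps2K.

```python
import re

# X stands for any residue, so DXS becomes D.S and DXDD becomes D.DD.
# Lookaheads are used so that overlapping occurrences are all reported.
MOTIFS = {
    "DXS": re.compile(r"(?=D.S)"),
    "DXDD": re.compile(r"(?=D.DD)"),
}


def find_motifs(protein: str, n_terminal_len: int = 120) -> dict[str, list[int]]:
    """0-based start positions of each motif within the N-terminal region."""
    region = protein.upper()[:n_terminal_len]
    return {name: [m.start() for m in pattern.finditer(region)]
            for name, pattern in MOTIFS.items()}


# Hypothetical sequence used only to exercise the function.
toy = "MKIAVDGSLLNTAEDYDDKK"
print(find_motifs(toy))
```

Because both motifs are very short, matches are expected by chance in many proteins; a hit is only informative when it falls at the aligned position in a sequence comparison such as the one in Fig. 2.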

Distribution of the cps2 genes in other S. suis serotypes. To
examine the relationship between the cps2 genes and cps genes
in the other S. suis serotypes, we performed
cross-hybridization experiments. DNA fragments of the individual
cps2 genes were amplified by PCR, labelled with 32P, and used
to probe Southern blots of chromosomal DNA of the reference
strains of the 35 different S. suis serotypes. Large variation
in the hybridization patterns was observed (Table 4). As a
positive control we used a probe specific for 16S rRNA. The
16S rRNA probe hybridized with all serotypes tested. However,
none of the other genes tested was common to all serotypes.
Based on the genetic organization of the genes we previously
suggested that orfX and cpsA-cpsK genes are part of one operon
and that the proteins encoded by these genes are all involved
in polysaccharide biosynthesis. OrfY and OrfZ are not a part
of this operon, and their role in the polysaccharide
biosynthesis is unclear. Based on sequence similarity data,
OrfY may be involved in regulation of the cps2 genes. OrfZ is
proposed to be unrelated to polysaccharide biosynthesis.
Probes specific for the orfZ, orfY, orfX, cpsA, cpsB, cpsC and
cpsD genes hybridized with most other serotypes. This suggests
that the proteins encoded by these genes are not type-specific,
but may perform more common functions in biosynthesis of the
capsular polysaccharide. This confirms previous data which
showed that the cps2A-cps2D genes showed strong similarity to
cps genes of several serotypes of Streptococcus pneumoniae.
Based on this similarity Cps2A is possibly a regulatory
protein, whereas Cps2B and Cps2C may play a role in length
determination and export of polysaccharide. The cps2E gene
hybridized with DNA of serotypes 1, 2, 14 and 1/2. The cps2E
gene showed a strong similarity to the cps14E gene of S.
pneumoniae (18). The enzyme encoded by cps14E was shown to have
glucosyl-1-phosphate transferase activity and catalyzed the
transfer of glucose to a lipid carrier (18). These data
indicate that a glycosyltransferase closely related to Cps14E
may be responsible for the first step in the biosynthesis of
polysaccharide in the S. suis serotypes 1, 2, 14 and 1/2. The
cps2F, cps2G, cps2H, cps2I and cps2J genes hybridized with
chromosomal DNA of serotypes 2 and 1/2 only. The cps2G gene
showed an additional weak hybridization signal with DNA of
serotype 34. In agglutination tests serotype 1/2 showed
agglutination with sera specific for serotype 2 as well as
with sera specific for serotype 1. This suggests that serotype
1/2 shares antigenic determinants with both types 1 and 2. The
hybridization data confirmed these data. All putative
glycosyltransferases present in serotype 2 are also present in
serotype 1/2. The cps2K gene showed a similar hybridization
pattern to the cps2E gene. Hybridization was observed with DNA
of serotypes 1, 2, 14 and 1/2. Taken together these
hybridization data show that the cps2 gene cluster can be
divided in three regions: a central region containing the
type-specific genes is flanked by two regions containing
common genes for various serotypes.
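
The reasoning from hybridization patterns to type-specific and shared genes can be written down as simple set operations. The sketch below encodes only the patterns stated in the text for cps2E to cps2K (the weak cps2G signal with serotype 34 is deliberately left out, and the probes that hybridized with "most other serotypes" are not included because the exact serotypes are listed only in Table 4); the function names are illustrative assumptions.

```python
# Serotypes whose chromosomal DNA hybridized with each probe, as stated in
# the text. The weak cps2G signal with serotype 34 is omitted on purpose.
HYBRIDIZATION = {
    "cps2E": {"1", "2", "14", "1/2"},
    "cps2F": {"2", "1/2"},
    "cps2G": {"2", "1/2"},
    "cps2H": {"2", "1/2"},
    "cps2I": {"2", "1/2"},
    "cps2J": {"2", "1/2"},
    "cps2K": {"1", "2", "14", "1/2"},
}


def probes_restricted_to(allowed: set[str]) -> list[str]:
    """Probes whose hybridization signal never falls outside `allowed`."""
    return sorted(gene for gene, serotypes in HYBRIDIZATION.items()
                  if serotypes <= allowed)


def shared_probes(serotype_a: str, serotype_b: str) -> list[str]:
    """Probes that hybridize with both of the given serotypes."""
    return sorted(gene for gene, serotypes in HYBRIDIZATION.items()
                  if serotype_a in serotypes and serotype_b in serotypes)


print(probes_restricted_to({"2", "1/2"}))   # central, type-specific region
print(shared_probes("1", "14"))             # genes shared by types 1 and 14
```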

Cloning of the type-specific cps genes of serotypes 1 and 9. 

To clone the type-specific cps genes of S. suis serotype 1 we
used the cps2E gene as a probe to identify chromosomal DNA
fragments of type 1 which contain flanking cps genes. A 5-kb
EcoRV fragment was identified and cloned in pKUN19. This
yielded pCPS1-1 (Fig. 1B). This fragment was in turn used as a
probe to identify an overlapping 2.2-kb HindIII fragment.
pKUN19 containing this HindIII fragment was designated
pCPS1-2. The same strategy was followed to identify and clone the
type-specific cps genes of serotype 9. In this case, we used
the cps2D gene as a probe. A 0.8-kb HindIII-XbaI fragment was
identified and cloned, yielding pCPS9-1 (Fig. 1C). This
fragment was in turn used as a probe to identify a 4-kb XbaI
fragment. pKUN19 containing this 4-kb XbaI fragment was
designated pCPS9-2.

Analysis of the cloned cps1 genes. The complete nucleotide
sequence of the inserts of pCPS1-1 and pCPS1-2 was determined
(figure 5). Examination of the sequence revealed the presence
of five complete and two incomplete Orfs (Fig. 1B). Each Orf
is preceded by a ribosome-binding site. In accord with data
obtained for the cps2 genes of serotype 2, the majority of the
Orfs is very closely linked. The only significant gap (718 bp)
was found between Cps1G and Cps1H. No obvious promoter
sequences or potential stem-loop structures could be found in
this region. This suggests that, as in serotype 2, the cps
genes in serotype 1 are arranged in an operon.

An overview of the Orfs and their properties is shown in
Table 2. As expected on the basis of the hybridization data
(Table 4), the protein encoded by the cps1E gene was related
to Cps2E of S. suis type 2 (identity of 86%). The fragment
cloned in pCPS1-1 lacked the coding region for the first 7
amino acids of the cps1E gene.

The proteins encoded by the cps1F and cps1G genes showed
strong similarity to the Cps14F and Cps14G proteins of
Streptococcus pneumoniae serotype 14, respectively (20). The
function of Cps14F is not completely clear, but it has
been suggested that Cps14F can play a role in enhancing
glycosyltransferase activity. The cps14G gene of S. pneumoniae
was shown to encode β-1,4-galactosyltransferase activity. In
S. pneumoniae type 14 this activity is required for the second
step in the biosynthesis of the oligosaccharide subunit (20).
Based on the similarity data, similar glycosyltransferase
and enhancing activities are suggested for the cps1G and
cps1F genes of S. suis type 1.

The protein encoded by the cps1H gene showed similarity to
the Cps14H protein of S. pneumoniae (20). Based on sequence
similarity Cps14H was proposed to be the polysaccharide
polymerase (20).

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide.

Between Cps1G and Cps1H a gap of 718 bp was found. This
region revealed three small Orfs. The three Orfs were
expressed in three different reading frames and were not
preceded by potential ribosome binding sites, nor did they
contain potential start sites. However, the three potential gene
products encoded by this region showed some similarity with
three successive regions of the C-terminal part of the EpsK
protein of Streptococcus thermophilus (27% identity, 40). The
region related to the first 82 amino acids is lacking.



Analysis of the cloned cps9 genes. We also determined the
complete nucleotide sequence of the inserts of pCPS9-1 and
pCPS9-2 (figure 6). Examination of the sequence revealed the
presence of three complete and two incomplete Orfs (Fig. 1C).
As in serotypes 1 and 2, all Orfs are preceded by a
ribosome-binding site and are very closely coupled. As suggested by the
hybridization data (Table 4) the Cps2D and Cps9D proteins were
highly related (Table 2). Based on sequence comparisons
pCPS9-1 lacked the first 27 amino acids of the Cps9D protein.

The protein encoded by the cps9E gene showed some
similarity with the CapD protein of Staphylococcus aureus
serotype 1 (24). Based on sequence similarity data the Cap1D
protein was suggested to be an epimerase or a dehydratase
involved in the synthesis of N-acetylfructosamine or
N-acetylgalactosamine (63).

Cps9F showed some similarity to the CapM proteins of S.
aureus serotypes 5 and 8 (61, 64, 65). Based on sequence
similarity data Cap5M and Cap8M are proposed to be
glycosyltransferases (63).

The protein encoded by the cps9G gene showed some
similarity with a protein of Actinobacillus
actinomycetemcomitans (AB002668_4). This protein is part of a
gene cluster responsible for the serotype-b-specific antigens
of Actinobacillus actinomycetemcomitans. The function of the
protein is unknown.

The protein encoded by the cps9H gene showed some
similarity with the rfbB gene product of Yersinia enterocolitica
(68). The RfbB protein was shown to be essential for O-antigen
synthesis, but the function of the protein in the synthesis
of the O:3 lipopolysaccharide is unknown.

Serotype 1 and serotype 9 specific cps genes. To determine
whether the cloned fragments in pCPS1-1, pCPS1-2, pCPS9-1 and
pCPS9-2 contained the type-specific genes for serotypes 1 and
9, respectively, cross-hybridization experiments were
performed. DNA fragments of the individual cps1 and cps9 genes
were amplified by PCR, labelled with 32P, and used to probe
Southern blots of chromosomal DNA of the reference strains of
the 35 different S. suis serotypes. The results are shown in
Table 5. Based on the data obtained with the cps2E probe
(Table 4), the cps1E probe was expected to hybridize with
chromosomal DNA of S. suis serotypes 1, 2, 14, 27 and 1/2. The
cps1H, cps9E and cps9F probes hybridized with most other
serotypes. However, the cps1F, cps1G and cps1I probes
hybridized with chromosomal DNA of serotypes 1 and 14 only.
The cps9G and cps9H probes hybridized with serotype 9 only.
These data suggest that the cps9G and cps9H probes are
specific for serotype 9 and therefore could be useful tools
for the development of rapid and sensitive diagnostic tests
for S. suis type 9 infections.
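
The selection of type-specific probes from a hybridization table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary below encodes the results as summarized in the text (not the full Table 5), the probes reported to hybridize with "most other serotypes" are omitted rather than guessed, and the function name specific_probes is hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hybridization results as summarized in the text; the full Table 5 is not
# reproduced here. cps1H, cps9E and cps9F are omitted because the text only
# says they hybridized with "most other serotypes".
HYBRIDIZATION: Dict[str, FrozenSet[str]] = {
    "cps1F": frozenset({"1", "14"}),
    "cps1G": frozenset({"1", "14"}),
    "cps1I": frozenset({"1", "14"}),
    "cps9G": frozenset({"9"}),
    "cps9H": frozenset({"9"}),
}


def specific_probes(results: Dict[str, FrozenSet[str]],
                    target: Iterable[str]) -> List[str]:
    """Return the probes that hybridize only with serotypes in `target`."""
    wanted = frozenset(target)
    # A probe is kept when it gave at least one signal and none of its
    # signals fall outside the target serotypes.
    return sorted(p for p, hits in results.items() if hits and hits <= wanted)


if __name__ == "__main__":
    print(specific_probes(HYBRIDIZATION, {"9"}))
    print(specific_probes(HYBRIDIZATION, {"1", "14"}))
```

With this encoding, asking for probes restricted to serotype 9 returns cps9G and cps9H, while no probe is restricted to serotype 1 alone, in line with the observation that the cps1 probes also hybridize with serotype 14.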

Type specific PCR. So far, the probes were tested on the 35
different reference strains only. To test the diagnostic value
of the type-specific cps probes further, several other S. suis
serotype 1, 2, 1/2, 9 and 14 strains were used. Moreover,
since a PCR based method would be even more rapid and
sensitive than a hybridization test, we tested whether we
could use a PCR for the serotyping of the S. suis strains. The

Capsular mutants are sensitive to phagocytosis and killing by
porcine alveolar macrophages (PAM).

The capsular mutants were tested for their ability to resist
phagocytosis by PAM in the presence of porcine SPF serum. The
wild type strain 10 seemed to be resistant to phagocytosis
under these conditions (Fig. 4A). In contrast, the mutant
strains were efficiently ingested by macrophages (Fig. 4A).
After 90 min, more than 99.7% (strain 10cpsB) and 99.8% (strain
10cpsEF) of the mutant cells had been ingested by the macrophages.
Moreover, as shown in Fig. 4B, the ingested strains were
efficiently killed by the macrophages: 90-98% of all ingested
cells were killed within 90 min. No differences could be
observed between wild type and mutant strains. These data
indicate that the capsule of S. suis type 2 efficiently
protects the organism from uptake by macrophages in vitro.



Capsular mutants are less virulent for germfree piglets. The
virulence properties of the wild-type and mutant strains were
tested after experimental infection of newborn germfree pigs
(45, 49). Table 1 shows that specific and nonspecific signs of
disease could be observed in all pigs inoculated with the wild
type strain. Moreover, all pigs inoculated with the wild type
strain died during the course of the experiment or were killed
because of serious illness or nervous disorders (Table 3). In
contrast, the pigs inoculated with strains 10cpsB and 10cpsEF
showed no specific signs of disease and all pigs survived until
the end of the experiment. The temperature of the pigs
inoculated with the wild type strain increased 2 days after
inoculation and remained high until day 5 (Table 3). The
temperature of the pigs inoculated with the mutant strains
sometimes exceeded 40°C; however, we could observe
significant differences in the fever index [i.e. the % of
observations in an experimental group during which pigs showed
fever (>40°C)] between pigs inoculated with wild type and
mutant strains. All pigs showed increased numbers of
polymorphonuclear leucocytes (PMLs) (>10 x 10^9 PMLs per litre)
(Table 3). However, in pigs inoculated with the mutant strains
the percentage of samples with increased numbers of PMLs was
considerably lower. S. suis strains and B. bronchiseptica could
be isolated from the nasopharynx and feces swab samples of all
pigs from 1 day post-infection until the end of the experiment
(Table 3). Postmortem, the wild type strain could frequently be
isolated from the central nervous system (CNS), kidney, heart,
liver, spleen, serosae, joints and tonsils. Mutant strains
could easily be recovered from the tonsils, but were never
recovered from the kidney, liver or spleen. Interestingly, low
numbers of the mutant strains were isolated from the CNS, the
serosae, the joints, the lungs and the heart. Taken together,
these data strongly indicated that mutant S. suis strains,
impaired in capsule production, are not virulent for young
germfree pigs.
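
The fever index used in this comparison (the percentage of observations in a group at which a pig had a temperature above 40°C) is simple to compute. The Python sketch below is a minimal illustration; the function name fever_index and the example readings are hypothetical and are not data from Table 3.

```python
from typing import Iterable

FEVER_THRESHOLD_C = 40.0  # the text defines fever as a temperature > 40 °C


def fever_index(temperatures_c: Iterable[float]) -> float:
    """Percentage of observations in one group that exceed 40 °C."""
    temps = list(temperatures_c)
    if not temps:
        raise ValueError("at least one observation is required")
    fevers = sum(1 for t in temps if t > FEVER_THRESHOLD_C)
    return 100.0 * fevers / len(temps)


if __name__ == "__main__":
    # Hypothetical readings for one experimental group (not from the source).
    readings = [39.2, 40.5, 41.0, 39.8]
    print(fever_index(readings))  # two of the four readings exceed 40 °C
```

All observations of all pigs in a group are pooled, so a group in which a few pigs are febrile at most time points and a group in which many pigs are briefly febrile can give the same index.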

We describe the identification and the molecular
characterisation of the cps locus, involved in the capsular
polysaccharide biosynthesis, of S. suis. Most of the genes
seemed to belong to a single transcriptional unit, suggesting a
co-ordinate control of these genes. We assign functions to most
of the gene products. We thereby identified regions involved in
regulation (Cps2A), chain length determination (Cps2B, C),
export (Cps2C) and biosynthesis (Cps2E, F, G, H, J, K). The
region involved in biosynthesis is located at the centre of the
gene cluster and is flanked by two regions containing genes
with more common functions. The incomplete orf2Z gene was
located at the 5'-end of the cloned fragment. Orf2Z showed some
similarity with the YitS protein of B. subtilis. However,
because the function of the YitS protein is unknown this did
not give us any information about the possible function of
Orf2Z. Because the orf2Z gene is not a part of the cps operon,
a role of this gene in polysaccharide biosynthesis is not
expected. The Orf2Y protein showed some similarity with the
YcxD protein of B. subtilis (53). The YcxD protein was suggested
to be a regulatory protein. Similarly, Orf2Y may be involved in
the regulation of polysaccharide biosynthesis. The Orf2X
protein showed similarity with the YAAA proteins of H.
influenzae and E. coli. The function of these proteins is
unknown. In S. suis type 2 the orf2X gene seemed to be the
first gene in the cps2 operon. This suggests a role of Orf2X in
the polysaccharide biosynthesis. In H. influenzae and E. coli,
however, these proteins are not associated with capsular gene
clusters. The analysis of isogenic mutants impaired in the
expression of Orf2X should give more insight into the presumed
role of Orf2X in the polysaccharide biosynthesis of S. suis
type 2.

The gene products encoded by the cps2E, cps2F, cps2G, cps2H,
cps2J and cps2K genes showed little similarity with
glycosyltransferases of several Gram-positive or Gram-negative
bacteria (18, 19, 20, 22, 25). The cps2E gene product shows
some similarity with the Cps14E protein of S. pneumoniae (18,
19). Cps14E is a glucosyl-1-phosphate transferase that links
glucose to a lipid carrier (18). In S. pneumoniae this is the
first step in the biosynthesis of the oligosaccharide repeating
unit. The structure of the S. suis serotype 2 capsule contains
glucose, galactose, rhamnose, N-acetyl glucosamine and sialic
acid in a ratio of 3:1:1:1:1 (7). Based on these data we
conclude that Cps2E of S. suis has glucosyltransferase
activity, and is involved in the linkage of the first sugar to
the lipid carrier.

The C-terminal region of the cps2F gene product showed some
similarity with the RfbU protein of Salmonella enterica. RfbU was
shown to have mannosyltransferase activity (24). Because
mannose is not a component of the S. suis type 2
polysaccharide, a mannosyltransferase activity is not expected
in this organism. Nevertheless, cps2F encodes a
glycosyltransferase with another sugar specificity.

Cps2G showed moderate similarity to a family of gene
products suggested to have galactosyltransferase activities
(22, 24, 40). Hence a similar activity is shown for Cps2G.

Cps2H showed some similarity with LgtD of H. influenzae
(U32768). Because LgtD was proposed to have glycosyltransferase
activity, a similar activity is fulfilled by Cps2H.

Cps2J and Cps2K showed similarity to Cps14J of S. pneumoniae
(20). Cps2J showed similarity with Cps14I of S. pneumoniae as
well. Cps14I was shown to have N-acetyl glucosaminyltransferase
activity, whereas Cps14J has a β-1,4-galactosyltransferase
activity (20). In S. pneumoniae Cps14I is responsible for the
addition of the third sugar and Cps14J for the addition of the
last sugar in the synthesis of the type 14 repeating unit
(20). Because the capsule of S. suis type 2 contains galactose
as well as N-acetyl glucosamine components,
galactosyltransferase as well as N-acetyl
glucosaminyltransferase activities could be envisaged for the
cps2J and cps2K gene products, respectively. As was observed
for Cps14I and Cps14J, the N-termini of Cps2J and Cps2K showed
a significant degree of sequence similarity. Within the
N-terminal domains of Cps14I and Cps14J, two small regions were
identified, which were also conserved in several other
glycosyltransferases (22). Within these two regions, two Asp
residues were proposed to be important for catalytic activity.
The two conserved regions, DXS and DXDD, were also found in
Cps2J and Cps2K.

The function of Cps2I remains unclear. Cps2I showed some
similarity with a protein of A. actinomycetemcomitans. Although
this protein is part of the gene cluster responsible for the
serotype-b-specific antigens, the function of the protein is
unknown.

We further describe the identification and characterization
of the cps genes specific for S. suis serotypes 1, 2 and 9.
After the entire cps2 locus of S. suis serotype 2 was cloned
and characterized, functions for most of the cps2 gene
products could be assigned by sequence homologies. Based on
these data the glycosyltransferase activities, required for
type specificity, could be located in the centre of the
operon. Cross-hybridization experiments, using the individual
cps2 genes as probes on chromosomal DNAs of the 35 different
serotypes, confirmed this idea. The regions containing the
type-specific genes of serotypes 1 and 9 could be cloned and
characterized, showing that an identical genetic organization
of the cps operons of other S. suis serotypes exists. The
cps1E, cps1F, cps1G, cps1H and cps1I genes revealed a
striking similarity with the cps14E, cps14F, cps14G, cps14H and
cps14J genes of S. pneumoniae. Interestingly, S. pneumoniae
serotype 14 is the serotype most commonly associated with
pneumococcal infections in young children (54), whereas S.
suis serotype 1 strains are most commonly isolated from
piglets younger than 8 weeks (46). In S. pneumoniae the
cps14E, cps14G, cps14I and cps14J genes encode the
glycosyltransferases required for the synthesis of the type 14
tetrameric repeating unit, showing that the cps1E, cps1G and
cps1I genes encoded glycosyltransferases. The precise
functions of these genes as well as the substrate
specificities of the enzymes can be established. In S.
pneumoniae the cps14E gene was shown to encode a
glucosyl-1-phosphate transferase catalyzing the transfer of
glucose to a lipid carrier. Moreover, cpsE-like genes were found
in S. pneumoniae serotypes 9N, 13, 14, 15B, 15C, 18F, 18A and 19F
(60). CpsE mutants were constructed in the serotypes 9N, 13,
14 and 15B. All mutant strains lacked glucosyltransferase
activity (60). Moreover, in all these S. pneumoniae serotypes
the cpsE gene seemed to be responsible for the addition of
glucose to the lipid carrier. Based on these data we suggest
that in S. suis type 1 the cps1E gene may fulfil a similar
function. The structure of the S. suis type 1 capsule is
unknown, but it is composed of glucose, galactose, N-acetyl
glucosamine, N-acetyl galactosamine and sialic acid in a ratio
of 1:2.4:1:1:1.4 (5). Therefore a role of a cpsE-like
glucosyltransferase activity can easily be envisaged.
CpsE-like sequences were also found in serotypes 2, 1/2 and 14.

For polysaccharide biosynthesis in S. pneumoniae type 14,
transfer of the second sugar of the repeating unit to the
first lipid-linked sugar is performed by the gene products of
cps14F and cps14G (20). Similar to Cps14F and Cps14G, the S.
suis type 1 proteins Cps1F and Cps1G may act as one
glycosyltransferase performing the same reaction. Cps14F and
Cps14G of S. pneumoniae showed similarity to the N-terminal
half and C-terminal half of the SpsK protein of Sphingomonas
(20, 67), respectively. This suggests a combined function for
both proteins. Moreover, cps14F- and cps14G-like sequences were
found in several serotypes of S. pneumoniae and these genes
always seemed to exist together (60). The same was observed
for S. suis type 1. The cps1F and cps1G probes hybridized
with type 1 and type 14 strains.

According to the similarity found between the cps1H gene and
the cps14H gene of S. pneumoniae (20), cps1H is expected to
encode a polysaccharide polymerase.

The protein encoded by the cps1I gene showed some
similarity with the Cps14J protein of S. pneumoniae (19). The
cps14J gene was shown to encode a β-1,4-galactosyltransferase
activity, responsible for the addition of the fourth (i.e.
last) sugar in the synthesis of the S. pneumoniae serotype 14
polysaccharide. In S. suis type 2 the proteins encoded by the
cps2J and cps2K genes showed similarity to the Cps14J protein.
However, no significant homologies were found between Cps2J,
Cps2K and Cps1I. In the N-terminal regions of Cps14J and
Cps14I two small conserved regions, DXS and DXDD, were
identified (19). These regions seemed to be important for
catalytic activity (13). At the same positions in the sequence
Cps2I contained the regions DXS and DXED.

In the region between Cps1G and Cps1H three small Orfs were
identified. Since the Orfs were expressed in three different
reading frames, and did not contain potential start sites,
expression is not expected. However, the three potential gene
products encoded by this region showed some similarity with
three successive regions of the C-terminal part of the EpsK
protein of Streptococcus thermophilus (27% identity, 40). The
region related to the first 82 amino acids is lacking. The
EpsK protein was suggested to play a role in the export of the
exopolysaccharide by rendering the polymerized
exopolysaccharide more hydrophobic through a lipid
modification. These data could suggest that the sequences in
the region between Cps1G and Cps1H originated from an epsK-like
sequence. Hybridization experiments showed that this epsK-like
region is also present in other serotype 1 strains as well as
in serotype 14 strains (results not shown).

The function of most of the cloned serotype 9 genes can be
established. Based on sequence similarity data the cps9E and
cps9F genes could encode glycosyltransferases (24, 61, 63, 64,
65). Moreover, the cps9G and cps9H genes showed similarity to
genes located in regions involved in polysaccharide
biosynthesis, but the function of these genes is unknown (68).

Cross-hybridization experiments using the individual cps2,
cps1 and cps9 genes as probes showed that the cps9G and cps9H
probes specifically hybridized with serotype 9 strains.
Therefore, these are useful as tools for the identification of
S. suis type 9 strains, both for diagnostic purposes and
in epidemiological and transmission studies. We previously
developed a PCR method which can be used to detect S. suis
strains in nasal and tonsil swabs of pigs (62). The method was
for example used to identify pathogenic (EF-positive) strains
of S. suis serotype 2. In recent years, besides S. suis
type 2 strains, serotype 9 strains have frequently been isolated
from organs of diseased pigs. However, until now a rapid and
sensitive diagnostic test was not available for type 9
strains. Therefore, the type 9 specific probes or the type 9
specific PCR is of great diagnostic value. The cps1F, cps1G
and cps1I probes hybridized with serotype 1 as well as with
serotype 14 strains. In coagglutination tests type 1 strains
react with the anti-type 1 as well as with the anti-type 14
antisera (56). This suggests the presence of common epitopes
between these serotypes. On the other hand type 1 strains
agglutinated only with anti-type 1 serum (56, 57), indicating
that it is possible to detect differences between those
serotypes.

The cps2F, cps2G, cps2H, cps2I and cps2J probes hybridized
with serotypes 2 and 1/2 only. Serotype 34 showed a weak
hybridizing signal with the cps2G probe. As shown in
agglutination tests, type 1/2 strains react with sera directed
against type 1 as well as with sera directed against type 2
strains (46). Therefore, type 1/2 shared antigens with both
types 1 and 2. Based on the hybridization patterns of serotype
1/2 strains with the cps1 and cps2 specific genes, serotype
1/2 seemed to be more closely related to type 2 strains than
to type 1 strains. In our current studies we identify
type-specific genes, primers or probes which are used for the
discrimination of serotypes 1, 14, 2 and 1/2 and others of
the 35 serotypes known so far. Furthermore, type-specific genes,
primers or probes can now easily be developed for as yet unknown
serotypes, once they become isolated.

Cloning and characterization of a further part of
the cps2 locus.

Based on the established sequence, 11 genes, designated
cps2L to cps2T, orf2U and orf2V, were identified. A gene
homologous to genes involved in the polymerization of the
repeating oligosaccharide unit (cps2O) as well as genes
involved in the synthesis of sialic acid (cps2P to cps2T) were
identified. Moreover, hybridization experiments showed that
the genes involved in the sialic acid synthesis are present in
S. suis serotypes 1, 2, 14, 27 and 1/2. The "cps2M" and "cps2N"
regions showed similarity to proteins involved in the
polysaccharide biosynthesis of other Gram-positive bacteria.
However, these regions seemed to be truncated or were
non-functional as the result of frame-shift or point mutations. At
its 3'-end the cps2 locus contained two insertional elements
("orf2U" and "orf2V"), both of which seemed to be
non-functional.

To clone the remaining part of the cps2 locus, sequences
of the 3'-end of pCPS26 (Fig. 1C) were used to identify a
chromosomal fragment containing cps2 sequences located further
downstream. This fragment was cloned in pKUN19 resulting in
pCPS29. Using a similar approach we subsequently isolated the
plasmids pCPS30 and pCPS34 containing downstream cps2
sequences (Fig. 1C).

Analysis of the cps2 operon.
The complete nucleotide sequence of the cloned fragments
was determined. Examination of the compiled sequence revealed
the presence of: a sequence encoding the C-terminal part of
Cps2K, six apparently functional genes (designated
cps2O-cps2T) and the remnants of 5 different ancestral genes
(designated "cps2L", "cps2M", "cps2N", "orf2U" and "orf2V").
The latter genes seemed to be truncated or incomplete as the
result of the presence of stop codons or frame-shift mutations
(Fig. 1A). Neither potential promoter sequences nor potential
stem-loop structures could be identified within the sequenced
region. A ribosome-binding site precedes each ORF and the
majority of the ORFs are very closely linked. Three intergenic
gaps were found: one between "cps2M" and "cps2N" (176
nucleotides), one between cps2O and cps2P (525 nucleotides),
and one between cps2T and "orf2U" (200 nucleotides). These and
our above data show that Orf2X and Cps2A-Orf2T are part of a
single operon.

A list of all loci and their properties is shown in Table
4. The "cps2L" region contained three potential ORFs, of 103,
79 and 152 amino acids, respectively, which were only
separated from each other by stop codons. Only the first ORF
is preceded by a potential ribosomal binding site and
contained a methionine start codon. This suggests that "cps2L"
originates from an ancestral cps2L gene, which coded for a
protein of 339 amino acids. The function of this hypothetical
Cps2L protein remains unclear so far: no significant
homologies were found between Cps2L and proteins present in
the data libraries. It is not clear whether the first ORF of
the "cps2L" region is expressed into a protein of 103 amino
acids. The "cps2M" region showed homology to the N-terminal
134 amino acids of the NeuA proteins of Streptococcus
agalactiae and Escherichia coli (AB017355, 32). However,
although the "cps2M" region contained a potential ribosome
binding site, a methionine start codon was absent. Compared
with the S. agalactiae sequence, the ATG start codon was
replaced by a lysine-encoding AAG codon. Moreover, the region
homologous to the first 58 amino acids of the S. agalactiae
NeuA (identity 77%) was separated from the region homologous
to amino acids 59-134 of NeuA by a repeated DNA sequence of
100 bp (see later). In addition, the region homologous to
amino acids 59 to 95 of NeuA (identity 32%) and the region
homologous to amino acids 96 to 134 of NeuA (identity
50%) were present in different reading frames. Therefore, the
partial and truncated NeuA homologue is probably nonfunctional
in S. suis. The "cps2N" region showed homology to CpsJ of S.
agalactiae (accession no. AB017355). However, sequences
homologous to the first 88 amino acids of CpsJ were lacking in
S. suis. Moreover, the homologous region was present in two
different reading frames. The protein encoded by the cps2O
gene showed homology to proteins of several streptococci
involved in the transport of the oligosaccharide repeating
unit (accession no. AB017355), suggesting a similar function
for Cps2O. The proteins encoded by the cps2P, cps2S and cps2T
genes showed homology to the NeuB, NeuD and NeuA proteins of
S. agalactiae and E. coli (accession no. AB017355). Because the
"cps2M" region also showed homology to NeuA of E. coli, the
S. suis cps2 locus contains a functional neuA gene (cps2T) as
well as a nonfunctional ("cps2M") gene. The mutual homology
between these two regions showed an identity of 77% at the
amino acid level over amino acids 1-58 and 49% over amino
acids 59-134. Cps2Q and Cps2R showed homology to the
N-terminal and C-terminal parts of the NeuC protein of S.
agalactiae and E. coli, respectively. This suggests that the
function of the S. agalactiae NeuC protein in S. suis is
likely fulfilled by two different proteins. In E. coli the
neu genes are known to be involved in the synthesis of sialic
acid. NeuNAc is synthesized from N-acetylmannosamine and
phosphoenolpyruvate by NeuNAc synthetase. Subsequently, NeuNAc
is converted to CMP-NeuNAc by the enzyme CMP-NeuNAc
synthetase. CMP-NeuNAc is the substrate for the synthesis of
polysaccharide. In E. coli K1 NeuB is the NeuNAc synthetase and
NeuA is the CMP-NeuNAc synthetase. NeuC has been implicated in
the NeuNAc synthesis, but its precise role is not known. The
precise role of NeuD is not known. A role of the Cps2P-Cps2T
proteins in the synthesis of sialic acid can easily be
envisaged, since the capsule of S. suis serotype 2 is rich in
sialic acid. In S. agalactiae sialic acid has been shown to be
critical to the virulence function of the type III capsule.
Moreover, it has been suggested that the presence of sialic
acid in the capsule of bacteria which can cause meningitis may be
important for the capacity of these bacteria to breach the
blood-brain barrier. So far, however, the requirement of the

sialic acid for virulence of S. suis remains unclear.

"Orf2U" and "Orf2V" showed homology to proteins located on
two different insertional elements. "Orf2U" is homologous to
IS1194 of Streptococcus thermophilus, whereas "Orf2V" showed
homology to a putative transposase of Streptococcus
pneumoniae. This putative transposase was recently found to be
associated with the type 2 capsular locus of S. pneumoniae.
Compared with the original insertional elements in S.
thermophilus and S. pneumoniae, both "Orf2U" and "Orf2V" are
likely to be non-functional due to frame-shift mutations
within their coding regions.

A striking observation was the presence of a sequence of
100 bp (Fig. 9) which was repeated three times within the cps2
operon. The sequence is highly conserved (between 94% and 98%)
and was found in the intergenic regions between cps2G and
cps2H, within "cps2M" and between cps2O and cps2P. No
significant homologies were found between this 100-bp direct
repeat sequence and sequences present in the data libraries,
suggesting that the sequence is unique for S. suis.
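
A conservation figure such as the 94% to 98% quoted for the repeat copies is typically a percent identity between aligned sequences. The Python sketch below shows that calculation for two equal-length, already aligned sequences. It is an assumption that the figures in the text were obtained in this ungapped way, and the sequences in the example are placeholders, not the repeat shown in Fig. 9.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two aligned sequences of equal length (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions at which both sequences carry the same residue.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Placeholder 100-bp sequence; NOT the S. suis repeat from Fig. 9.
    copy_a = "ACGT" * 25
    copy_b = list(copy_a)
    for pos in (0, 25, 50, 75):  # introduce four substitutions
        copy_b[pos] = "A" if copy_b[pos] != "A" else "T"
    print(percent_identity(copy_a, "".join(copy_b)))  # 96 of 100 positions match
```

For a 100-bp repeat, each mismatch lowers the identity by one percentage point, so 94% to 98% corresponds to between two and six differing positions per pair of copies under this ungapped assumption.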

Distribution of the cps2 sequences among the 35 S. suis
serotypes. To examine the presence of sialic acid encoding
genes in other S. suis serotypes, we performed
cross-hybridization experiments. DNA fragments of the individual
cps2 genes were amplified by PCR, radiolabelled with 32P and
hybridized to chromosomal DNA of the reference strains of the
35 different S. suis serotypes. As a positive control we used
a probe specific for S. suis 16S rRNA. The 16S rRNA probe
hybridized with almost equal intensities to all serotypes
tested (Table 4). The "cps2L" sequence hybridized with DNA of
serotypes 1, 2, 14 and 1/2. The "cps2M", cps2O, cps2P, cps2Q,
cps2R, cps2S and cps2T genes hybridized with DNA of serotypes
1, 2, 14, 27 and 1/2. Because the cps2P-cps2T genes are most
probably involved in the synthesis of sialic acid, these
results suggest that sialic acid is also a part of the capsule
in S. suis serotypes 1, 2, 14, 27 and 1/2. This is in
agreement with the finding that the serotypes 1, 2 and 1/2
possess a capsule that is rich in sialic acid. Although the
chemical compositions of the capsules of serotypes 14 and 27
are unknown, recent agglutination studies using sialic
acid-binding lectins suggested the presence of sialic acid in S.
suis serotype 14, but not in serotype 27. In these studies,
sialic acid was also detected in serotypes 15 and 16. Since
the latter observation is not in agreement with our
hybridization studies, it might be that other genes, not
homologous to the cps2P-cps2T genes, are responsible for the
sialic acid synthesis in serotypes 15 and 16.
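
The agreement and disagreement between the two lines of evidence can be made explicit with simple set operations. The Python sketch below uses only the serotypes named in the text; the variable and function names are hypothetical, and the two sets are a reading of the paragraph above rather than a reproduction of Table 4.

```python
from typing import Dict, Set

# Serotypes whose DNA hybridized with the cps2P-cps2T probes (from the text).
HYBRIDIZING: Set[str] = {"1", "2", "14", "27", "1/2"}

# Serotypes for which the text reports capsular sialic acid: 1, 2 and 1/2
# (capsule composition) and 14, 15 and 16 (lectin agglutination).
SIALIC_ACID_REPORTED: Set[str] = {"1", "2", "1/2", "14", "15", "16"}


def compare(hybridizing: Set[str], reported: Set[str]) -> Dict[str, Set[str]]:
    """Split serotypes into concordant and discordant groups."""
    return {
        "concordant": hybridizing & reported,
        "genes_without_reported_sialic_acid": hybridizing - reported,
        "sialic_acid_without_genes": reported - hybridizing,
    }


if __name__ == "__main__":
    for label, serotypes in compare(HYBRIDIZING, SIALIC_ACID_REPORTED).items():
        print(label, sorted(serotypes))
```

The two discordant groups correspond to the cases discussed in the text: serotype 27 carries hybridizing sequences without lectin evidence of sialic acid, and serotypes 15 and 16 show sialic acid without hybridizing to the cps2P-cps2T probes.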

A probe based on "cps2N" sequences hybridized with DNA from
serotypes 1, 2, 14 and 1/2. A probe specific for "orf2U"
hybridized with serotypes 1, 2, 7, 14, 24, 27, 32, 34, and
1/2, whereas a probe specific for "orf2V" hybridized with many
different serotypes. In addition, we prepared a probe specific
for the 100-bp direct repeat sequence. This probe hybridized
with the serotypes 1, 2, 13, 14, 22, 24, 27, 29, 32, 34 and
1/2 (Table 4). To analyze the number of copies of the direct
repeat sequence within the S. suis serotype 2 chromosome, a
Southern blot hybridization and analysis was performed.
To this end, chromosomal DNA of S. suis serotype 2 was digested
with NcoI and hybridized with a 32P-labelled direct repeat
sequence. Only one hybridizing fragment, containing the three
direct repeats present on the cps2 locus, was found (results
not shown). This indicates that the 100-bp direct repeat
sequence is only associated with the cps2 locus. In S.
pneumoniae a 115-bp long repeated sequence was found to be
associated with the capsular genes of serotypes 1, 3, 14 and
19F. In S. pneumoniae this 115-bp sequence was also found in
the vicinity of other genes involved in pneumococcal virulence
(hyaluronidase and neuraminidase genes). A regulatory role of
the 115-bp sequence in co-ordinate control of these
virulence-related genes was suggested.

To study the role of the capsule in resistance to
phagocytosis and in virulence, we constructed two isogenic
mutants in which capsule synthesis was disturbed. In 10cpsB,
the cps2B gene was disturbed by the insertion of an
antibiotic-resistance gene, whereas in 10cpsEF parts of the
cps2E and cps2F genes were replaced. Both mutant strains
seemed to be completely unencapsulated. Because the cps2
genes seemed to be part of an operon, polar effects cannot be
excluded. Therefore these data did not give any information
about the role of Cps2B, Cps2E or Cps2F in the polysaccharide
biosynthesis. However, the results clearly show that the
capsular polysaccharide of S. suis type 2 is a surface
component with antiphagocytic activity. In vitro, wild type
encapsulated bacteria are ingested by phagocytes at a very low
frequency, whereas the mutant unencapsulated bacteria are
efficiently ingested by porcine macrophages. Within 2 hours,
over 99.6% of mutant bacteria were ingested and over 92% of
the ingested bacteria were killed. Intracellularly, wild type
as well as mutant strains seemed to be killed with the same
efficiency. This suggests that the loss of capsular material
is associated with loss of capacity to resist uptake by
macrophages. This loss of resistance to in vitro phagocytosis
was associated with a substantial attenuation of the virulence
in germfree pigs. All pigs inoculated with the mutant strains
survived the experiment and did not show any specific clinical
signs of disease. Only some nonspecific clinical signs of
disease could be observed. Moreover, mutant bacteria could be
reisolated from the pigs. This supports the idea that, as in
other pathogenic Streptococci, the capsule of S. suis acts as
an important virulence factor. Transposon mutants prepared by
Charland, impaired in capsule production, showed a reduced
virulence in pigs and mice. To construct these mutants the
type 2 reference strain S735 was used. We previously showed
that this strain is only weakly virulent for young pigs.
Moreover, the insertion site of the transposon has not been
determined so far.

As a further example herein a rapid PCR test for Streptococcus
suis type 1 is described.

Recent epidemiological studies on Streptococcus suis
infections in pigs indicated that, besides serotypes 1, 2 and
9, serotype 7 is also frequently associated with diseased
animals. For the latter serotype, however, no rapid and
sensitive diagnostic methods are available. This hampers
prevention and control programs. Here we describe the
development of a type-specific PCR test for the rapid and
sensitive detection of S. suis serotype 7. The test is based
on DNA sequences of capsular (cps) genes specific for serotype
7. These sequences could be identified by cross-hybridization
of several individual cps genes with the chromosomal DNAs of
35 different S. suis serotypes.

Streptococcus suis is an important cause of meningitis,
septicemia, arthritis and sudden death in young pigs [69, 70].
It can, however, also cause meningitis in man [71]. Attempts
to control the disease are still hampered by the lack of
sufficient knowledge about the epidemiology of the disease and
the lack of effective vaccines and sensitive diagnostics.
S. suis strains can be identified and classified by their
morphological, biochemical and serological characteristics
[70, 73, 74]. Serological classification is based on the
presence of specific antigenic determinants. Isolated and
biochemically characterized S. suis cells are agglutinated
with a panel of specific sera. These typing methods are very
laborious and time-consuming and can only be performed on
isolated colonies. Moreover, it has been reported that
nonspecific cross-reactions may occur among different types of
S. suis [75, 76].

So far, 35 different serotypes have been described [7 , 78,
79]. S. suis serotype 2 is the most prevalent type isolated
from diseased pigs, followed by serotypes 9 and 1. However,
recently serotype 7 strains were also frequently isolated from
diseased pigs [80, 81, 82]. This suggests that infections
with S. suis serotype 7 strains are an increasing
problem. Moreover, the virulence of S. suis serotype 7 strains
was confirmed by experimental infection of young pigs [83].
Recently, rapid and sensitive PCR assays specific for
serotypes 2 (and 1/2), 1 (and 14) and 9 were developed [84].
These assays were based on the cps loci of S. suis serotypes 2,
1 and 9 [84, 85]. However, until now no rapid and sensitive
diagnostic test has been available for S. suis serotype 7. Herein we
describe the development of a PCR test for the rapid and
sensitive detection of S. suis serotype 7 strains. The test is
based on DNA sequences which form a part of the cps locus of
S. suis serotype 7. Compared with serological typing
methods, the PCR assay was rapid, reliable and sensitive.
Therefore, this test, in combination with the PCR tests
which we previously developed for serotypes 1, 2 and 9, will
undoubtedly contribute to a more rapid and reliable diagnosis
of S. suis and may facilitate control and eradication
programs.
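How results from the combined type-specific PCR tests might be read can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the assay labels ("PCR-1", "PCR-2", "PCR-9", "PCR-7") and the function name are hypothetical, and the serotype groups simply restate the text above (the type 2 assay also detects serotype 1/2, the type 1 assay also detects serotype 14).

```python
# Hypothetical assay labels; the serotype groups restate the text:
# the type 2 assay also detects 1/2 and the type 1 assay also detects 14.
ASSAY_SEROTYPES = {
    "PCR-1": ("1", "14"),
    "PCR-2": ("2", "1/2"),
    "PCR-9": ("9",),
    "PCR-7": ("7",),
}


def candidate_serotypes(positive_assays):
    """Return the sorted serotypes compatible with the positive assays."""
    candidates = set()
    for assay in positive_assays:
        if assay not in ASSAY_SEROTYPES:
            raise KeyError(f"unknown assay: {assay}")
        candidates.update(ASSAY_SEROTYPES[assay])
    return sorted(candidates)


if __name__ == "__main__":
    print(candidate_serotypes(["PCR-7"]))           # a type 7 positive sample
    print(candidate_serotypes(["PCR-2", "PCR-9"]))  # a mixed carrier sample
```

A positive type 1 or type 2 assay leaves two candidate serotypes each, so in this sketch a further test would still be needed to separate serotype 1 from 14 and serotype 2 from 1/2.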
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Bacterial strains, growth conditions and serotyping. 

The bacterial strains and plasmids used in this study are
listed in Table 7. The S. suis reference strains were obtained
from M. Gottschalk, Canada. S. suis strains were grown in
Todd-Hewitt broth (code CM189, Oxoid), and plated on Columbia
agar blood base (code CM331, Oxoid) containing 6% (v/v) horse
blood. E. coli strains were grown in Luria broth [86] and
plated on Luria broth containing 1.5% (w/v) agar. If required,
ampicillin was added to the plates. The S. suis strains were
serotyped by the slide agglutination test with
serotype-specific antibodies [70].

DNA techniques.

Routine DNA manipulations and PCR reactions were performed
as described by Sambrook et al. [88]. Blotting and
hybridization were performed as described previously [84,86].

DNA sequence analysis.

DNA sequences were determined on a 373A DNA Sequencing
System (Applied Biosystems, Warrington, GB). Samples were
prepared by use of an ABI/PRISM dye terminator cycle sequencing
ready reaction kit (Applied Biosystems). Custom-made
sequencing primers were purchased from Life Technologies.
Sequencing data were assembled and analyzed using the
McMollyTetra program. The BLAST program was used to search for
protein sequences homologous to the deduced amino acid
sequences.

PCR. 

The primers used for the cps7H PCR correspond to the
positions 3334-3354 and 3585-3565 in the S. suis cps7 locus.
The sequences were:
5'-AGCTCTAACACGAAATAAGGC-3' and 5'-GTCAAACACCCTGGATAGCCG-3'.
The reaction mixtures contained 10 mM Tris-HCl, pH 8.3; 1.5 mM
MgCl2; 50 mM KCl; 0.2 mM of each of the four deoxynucleotide
triphosphates; 1 µM of each of the primers and 1 U of
AmpliTaq Gold DNA polymerase (Perkin Elmer Applied Biosystems,
New Jersey). DNA amplification was carried out in a Perkin
Elmer 9600 thermal cycler and the program consisted of an
incubation for 10 min at 95°C and 30 cycles of 1 min at 95°C,
2 min at 56°C and 2 min at 72°C.
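The primer pair and cycling program given above can be checked with a small in-silico sketch in Python. The two primer sequences and the hold times are taken from the text. Everything else is an assumption made for illustration: the function names are hypothetical, and the template is a synthetic toy sequence (not the cps7 locus) whose filler length was chosen so that the product is 251 bp, the size stated for the cps7H fragment. The hold-time total ignores ramping between temperatures.

```python
# Primer sequences as given in the text (5' to 3').
FORWARD = "AGCTCTAACACGAAATAAGGC"
REVERSE = "GTCAAACACCCTGGATAGCCG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product bounded by both primers on the given strand,
    or None if either primer site is absent."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the
    # template strand its site appears as the reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


# Synthetic toy template (assumption, not the real cps7 sequence):
# 209 nt of filler between two 21-nt primer sites gives 251 bp.
filler = "AT" * 104 + "A"
toy_template = "GG" + FORWARD + filler + reverse_complement(REVERSE) + "CC"

# Programmed hold times: 10 min at 95°C, then 30 cycles of 1 + 2 + 2 min.
programmed_minutes = 10 + 30 * (1 + 2 + 2)

if __name__ == "__main__":
    print(amplicon_length(toy_template, FORWARD, REVERSE))
    print(programmed_minutes)
```

A template lacking either primer site returns None, which corresponds to the absence of a PCR product.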

Results and discussion 

Cloning of the serotype 7-specific cps genes.

To isolate the type-specific cps genes of S. suis serotype
7 we used the cps9E gene of serotype 9 as a probe to identify
chromosomal DNA fragments of type 7 containing homologous DNA
sequences [84]. A 1.6-kb PstI fragment was identified and
cloned in pKUN19. This yielded pCPS7-1 (Fig. 11C). In turn,
this fragment was used as a probe to identify an overlapping
2.7-kb ScaI-ClaI fragment. pGEM7 containing the latter
fragment was designated pCPS7-2 (Fig. 11C).

Analysis of the cloned cps7 genes. 

The complete nucleotide sequences of the inserts of pCPS7-1
and pCPS7-2 were determined. Examination of the cps7 sequence
revealed the presence of two complete and two incomplete open
reading frames (ORFs) (Fig. 11C). All ORFs are preceded by a
ribosome-binding site. In accordance with the data obtained for
the cps1, cps2 and cps9 genes of serotypes 1, 2 and 9,
respectively, the type 7 ORFs are very closely linked to each
other. The only significant intergenic gap was that found
between cps7E and cps7F (443 nucleotides). No obvious promoter
sequences or potential stem-loop structures were found in this
region. This suggests that, as in serotypes 1, 2 and 9, the cps
genes in serotype 7 form part of an operon.
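The distinction between complete and incomplete ORFs can be illustrated with a minimal Python sketch. It is not the analysis program used here; the function name, the toy sequence and the simplifications are assumptions. The sketch scans only the forward strand, requires an ATG start, and reports an ORF as incomplete when the sequence ends before a stop codon is reached. An ORF truncated at its 5' end, such as the one described for cps7E, would not be detected by this simple scan.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, complete) tuples for ATG-initiated ORFs
    on the forward strand. 'end' is exclusive and includes the stop codon
    when the ORF is complete."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            j = i
            # Walk codon by codon until a stop codon or the sequence end.
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            complete = j + 3 <= len(seq)
            end = j + 3 if complete else j
            if (end - i) // 3 >= min_codons:
                orfs.append((i, end, complete))
            i = end  # resume scanning after this ORF
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGAAATAGGATGCCC"  # hypothetical toy sequence
    for start, end, complete in find_orfs(toy):
        label = "complete" if complete else "incomplete (runs off the 3' end)"
        print(start, end, label)
```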

An overview of the ORFs and their properties is shown in
Table 8. As expected on the basis of the hybridization data
[84], the Cps9E and Cps7E proteins showed a high similarity
(identity 99%, Table 8). Based on sequence comparisons between
Cps9E and Cps7E, the PstI fragment of pCPS7-1 lacks the region
encoding the first 371 codons of Cps7E. The C-terminal part of
the protein encoded by the cps7F gene showed some similarity
with the BplG protein of Bordetella pertussis [88], as well
as with the C-terminal part of S. suis Cps2E [85]. Both BplG
and Cps2E were suggested to have glycosyltransferase activity
and are probably involved in the linkage of the first sugar to
the lipid carrier [85,88]. The protein encoded by the cps7G
gene showed similarity with the BplF protein of Bordetella
pertussis [88]. BplF is likely to be involved in the
biosynthesis of an amino sugar, suggesting a similar function
for Cps7G. The protein encoded by the cps7H gene showed
similarity with the WbdN protein of E. coli [89] as well as
with the N-terminal part of the Cps2K protein of S. suis [81].
Both WbdN and Cps2K were suggested to have glycosyltransferase
activity [85, 89].
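How an identity percentage such as the one quoted for Cps9E and Cps7E can be derived from a pairwise alignment is sketched below in Python. The function name and the two short aligned strings are hypothetical, and the definition used (identical residues divided by aligned columns, with columns that are gaps in both sequences skipped) is only one of several conventions; the value reported in Table 8 may have been calculated differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences carry a gap are ignored; a gap facing
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Cps9E or Cps7E.
    print(percent_identity("MKKLV-AG", "MKRLVQAG"))
```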

Serotype 7 specific cps genes. 

To determine whether the cloned fragments in pCPS7-1 and
pCPS7-2 contained serotype 7-specific DNA sequences,
cross-hybridization experiments were performed. DNA fragments of the
individual cps7 genes were amplified by PCR, labelled with
32P, and used to probe spot blots of chromosomal DNA of the
reference strains of 35 different S. suis serotypes. The
results are summarized in Table 9. As expected, based on the
data obtained with the cps9E probe [84], the cps7E probe
hybridized with chromosomal DNA of many different S. suis
serotypes. The cps7F and cps7G probes showed hybridization
with chromosomal DNA of S. suis serotypes 4, 5, 7, 17, and 23.
However, the cps7H probe hybridized with chromosomal DNA of
serotype 7 only, indicating that this gene is specific for
serotype 7.
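The selection of a type-specific probe from such cross-hybridization data can be expressed as a short Python sketch. The serotype sets for cps7F, cps7G and cps7H restate the results given above; cps7E is left out because the text only states that it hybridized with many serotypes without listing them. The function and variable names are hypothetical.

```python
# Serotypes whose chromosomal DNA hybridized with each probe (from the text).
# cps7E is omitted: the text reports "many" serotypes without listing them.
HYBRIDIZATION = {
    "cps7F": {"4", "5", "7", "17", "23"},
    "cps7G": {"4", "5", "7", "17", "23"},
    "cps7H": {"7"},
}


def specific_probes(results, target):
    """Return the probes that hybridized with the target serotype only."""
    return sorted(
        probe for probe, serotypes in results.items() if serotypes == {target}
    )


def cross_reacting(results, probe, target):
    """Return the non-target serotypes detected by a probe."""
    return sorted(results[probe] - {target}, key=int)


if __name__ == "__main__":
    print(specific_probes(HYBRIDIZATION, "7"))
    print(cross_reacting(HYBRIDIZATION, "cps7F", "7"))
```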
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Type specific PCR. 

We tested whether we could use PCR instead of hybridization
for the typing of S. suis serotype 7 strains. For that
purpose we selected an oligonucleotide primer set within the
cps7H gene with which an amplified fragment of 251 bp was
expected. In addition, we included in our analysis several S.
suis serotype 7 strains other than the reference strain.
These strains were obtained from different countries and were
isolated from different organs (Table 7). The results show
that indeed a fragment of about 250 bp was amplified with all
type 7 strains used (Fig. 12B), whereas no PCR products were
obtained with serotype 1, 2 and 9 strains (Fig. 12A). This
suggests that the PCR test, as described here, is a rapid
diagnostic tool for the identification of S. suis serotype 7
strains. Until now such a diagnostic test was not available
for serotype 7 strains. Together with the recently developed
PCR assays for serotypes 1, 2, 1/2, 14 and 9, this assay may be
an important diagnostic tool to detect pigs carrying serotype
2, 1/2, 1, 14, 9 and 7 strains and may facilitate control and
eradication programs.

Strain/plasmid    Relevant characteristics                            Source/reference

Strain
E. coli
CC118             PhoA-                                               (28)
XL2 blue                                                              Stratagene

E. coli
XL2 blue                                                              Stratagene

S. suis
10                virulent serotype 2 strain                          (49)
3                 serotype 2                                          (63)
17                serotype 2                                          (63)
735               reference strain serotype 2                         (63)
T15               serotype 2                                          (63)
6555              reference strain serotype 1                         (63)
6388              serotype 1                                          (63)
6290              serotype 1                                          (63)
5637              serotype 1                                          (63)
5673              serotype 1/2                                        (63)
5679              serotype 1/2                                        (63)
5928              serotype 1/2                                        (63)
5934              serotype 1/2                                        (63)
5209              reference strain serotype 1/2                       (63)
5218              reference strain serotype 9                         (63)
5973              serotype 9                                          (63)
6437              serotype 9                                          (63)
6207              serotype 9                                          (63)
reference strains serotypes 1-34                                      (9, 56, 14)

S. suis
10                virulent serotype 2 strain                          (51)
10cpsB            isogenic cpsB mutant of strain 10                   this work
10cpsEF           isogenic cpsEF mutant of strain 10                  this work

Plasmid
pKUN19            replication functions pUC, AmpR                     (23)
pGEM7Zf(+)        replication functions pUC, AmpR                     Promega Corp.
pIC19R            replication functions pUC, AmpR                     (29)
pIC20R            replication functions pUC, AmpR                     (29)
pIC-spc           pIC19R containing SpcR gene of pDL282               lab collection
pDL282            replication functions of pBR322 and                 (43)
                  pVT736-1, AmpR, SpcR
pPHOS2            pIC-spc containing the truncated phoA gene          this work
                  of pPHO7 as a PstI-BamHI fragment
pPHO7             contains truncated phoA gene                        (15)
pPHOS7            pPHOS2 containing chromosomal S. suis DNA           this work
pCPS6             pKUN19 containing 6 kb HindIII fragment             this work (Fig. 1)
                  of cps operon
pCPS7             pKUN19 containing 3.5 kb EcoRI-HindIII fragment     this work (Fig. 1)
                  of cps operon
pCPS11            pCPS7 in which 0.4 kb PstI-BamHI fragment           this work (Fig. 1)
                  of cpsB gene is replaced by SpcR gene of pIC-spc
pCPS17            pKUN19 containing 3.1 kb KpnI fragment              this work (Fig. 1)
                  of cps operon
pCPS18            pKUN19 containing 1.8 kb SnaBI fragment             this work (Fig. 1)
                  of cps operon
pCPS20            pKUN19 containing 3.3 kb XbaI-HindIII               this work (Fig. 1)
                  fragment of cps operon
pCPS23            pGEM7Zf(+) containing 1.5 kb MluI fragment          this work (Fig. 1)
                  of cps operon
pCPS25            pIC20R containing 2.5 kb KpnI-SalI fragment         this work (Fig. 1)
                  of pCPS17
pCPS26            pKUN19 containing 3.0 kb HindIII fragment           this work (Fig. 1)
                  of cps operon
pCPS27            pCPS25 containing 2.3 kb XbaI (blunt)-ClaI          this work (Fig. 1)
                  fragment of pCPS20
pCPS28            pCPS27 containing the 1.2 kb PstI-XhoI SpcR         this work (Fig. 1)
                  gene of pIC-spc
pCPS29            pKUN19 containing 2.2 kb SacI-PstI fragment         this work (Fig. 1)
                  of cps operon
pCPS1-1           pKUN19 containing 5 kb EcoRV fragment               this work (Fig. 1)
                  of cps operon of type 1
pCPS1-2           pKUN19 containing 2.2 kb HindIII fragment           this work (Fig. 1)
                  of cps operon of type 1
pCPS9-1           pKUN19 containing 1 kb HindIII-XbaI                 this work (Fig. 1)
                  fragment of cps operon of serotype 9
pCPS9-2           pKUN19 containing 4.0 kb XbaI-XbaI                  this work (Fig. 1)
                  fragment of cps operon of serotype 9

AmpR: ampicillin resistant
SpcR: spectinomycin resistant
cps: capsular polysaccharide
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LEGENDS TO FIGURES 

Figure 1. 

Organization of the cps2 gene cluster of S. suis type 2. 

(A) Genetic map of the cps2 gene cluster. The shadowed arrows 
represent potential ORFs. Interrupted ORFs indicate the 
presence of stop codons or frame-shift mutations. Gene 
designations are indicated below the ORFs. The closed arrows 
indicate the position of the potential promoter sequences. I 
indicates the position of the potential transcription 
regulator sequence. I I I indicates the position of the 100-bp 
repeated sequence. 

(B) Physical map of the cps2 locus. 

Restriction sites are as follows: A: AluI; C: ClaI; E: EcoRI;
H: HindIII; K: KpnI; M: MluI; N: NsiI; P: PstI; S: SnaBI; Sa:
SacI; X: XbaI.

(C) The DNA fragments cloned in the various plasmids. 
Figure 2 

Ethidium bromide stained agarose gel showing PCR products
obtained with chromosomal DNA of S. suis strains belonging to
the serotypes 1, 2, 1/2, 9 and 14 and cps2J, cps1I and cps9H
primer sets as described in Materials and Methods. (A) cps1I
primers. (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
serotype 1 strains; lanes 4-6: serotype 2 strains; lanes 7-9:
serotype 1/2 strains; lanes 10-12: serotype 9 strains and
lanes 13-15: serotype 14 strains.

(B) Ethidium bromide stained agarose gel showing PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2, type 1 or type 9 strains and cps2J, cps1I and
cps9H primer sets as described in Materials and Methods.
Bacterial DNA suitable for PCR was prepared by using the
multiscreen method as described previously (20). (A) cps1I
primers. (B) cps2J primers and (C) cps9H primers. Lanes 1-3:
PCR products obtained with tonsillar swabs collected from pigs
carrying S. suis type 1 strains; lanes 4-6: PCR products
obtained with tonsillar swabs collected from pigs carrying
S. suis type 2 strains; lanes 7-9: PCR products obtained with
tonsillar swabs collected from pigs carrying S. suis type 9
strains; lanes 10-12: PCR products obtained with chromosomal
DNA from serotype 9, 2 and 1 strains respectively; lane 13:
negative control, no DNA present.



Figure 3

CPS2 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 4

CPS1 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 5

CPS9 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 6

CPS7 nucleotide sequences and corresponding amino acid
sequences from the open reading frames.

Figure 7 

Alignments of the N-terminal parts of Cps2J and Cps2K.
Identical amino acids are marked by bars. The amino acids
shown in bold are also conserved in Cps14I, Cps14J of S.
pneumoniae and several other glycosyltransferases (19). The
aspartate residues marked by asterisks are strongly conserved.



Figure 8 

Transmission electron micrographs of thin sections of various
S. suis strains.

(A) wild type strain 10;
(B) mutant strain 10cpsB;
(C) mutant strain 10cpsEF.
Bar = 100 nm

Figure 9

(A) Kinetics of phagocytosis of wild type and mutant S. suis
strains by porcine alveolar macrophages (AM). Phagocytosis was
determined as described in Materials and Methods. The Y-axis
represents the number of CFU per milliliter in the supernatant
fluids as determined by plate counting, the X-axis represents
time in minutes.

□ wild type strain 10;
○ mutant strain 10cpsB;
△ mutant strain 10cpsEF.

(B) Kinetics of intracellular killing of wild type and mutant
S. suis strains by porcine AM. The intracellular killing was
determined as described in Materials and Methods. The Y-axis
represents the number of CFU per ml in the supernatant fluids
after lysis of the macrophages as determined by plate
counting, the X-axis represents time in minutes.

□ wild type strain 10;
○ mutant strain 10cpsB;
△ mutant strain 10cpsEF.

Figure 10 

Nucleotide sequence alignment of the highly conserved 100-bp 
repeated element. 

1) 100-bp repeat between cps2G and cps2H
2) 100-bp repeat within "cps2M"
3) 100-bp repeat between cps2O and cps2P

Figure 11. The cps2, cps9 and cps7 gene clusters of S. suis
serotypes 2, 9 and 7.
(A) Genetic organization of the cps2 gene cluster [84]. The
large arrows represent potential ORFs. Gene designations are
indicated below the ORFs. Identically filled arrows represent
ORFs which showed homology. The small closed arrows indicate
the position of the potential promoter sequences. I indicates
the position of the potential transcription regulator
sequence.

(B) Physical map and genetic organization of the cps9 gene
cluster [15]. Restriction sites are as follows: B: BamHI; P:
PstI; H: HindIII; X: XbaI. The DNA fragments cloned in the
various plasmids are indicated. The open arrows represent
potential ORFs.

(C) Physical map and genetic organization of the cps7 gene
cluster. Restriction sites are as follows: C: ClaI; P: PstI;
Sc: ScaI. The DNA fragments cloned in the various plasmids are
indicated. The open arrows represent potential ORFs.

Figure 12 (A) Ethidium bromide stained agarose gel showing PCR
products obtained with chromosomal DNA of S. suis strains
belonging to the serotypes 1, 2, 9 and 7 and the cps7H primer
set. Strain designations are indicated above the lanes. C:
negative control, no DNA present. M: molecular size marker
(lambda digested with EcoRI and HindIII).

(B) Ethidium bromide stained agarose gel showing PCR products
obtained with serotype 7 strains collected in different
countries and from different organs. Bacterial DNA suitable
for PCR was prepared by using the multiscreen method as
described previously [89]. Strain designations are indicated
above the lanes. M: molecular size marker (lambda digested
with EcoRI and HindIII).
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CLAIMS 

1. An isolated or recombinant nucleic acid encoding a capsular 
gene cluster of Streptococcus suis or a gene or gene fragment 
derived thereof. 

2. A nucleic acid according to claim 1 encoding a
Streptococcus suis serotype-specific central region,
preferably encoding at least one enzyme or fragment thereof
involved in polysaccharide biosynthesis.

3. A nucleic acid according to claim 1 or 2 hybridising to a
nucleic acid encoding a gene derived from a Streptococcus suis
serotype 1, 2 or 9 capsular gene cluster.

4. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 2 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 3.

5. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 1 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 4.

6. An isolated or recombinant nucleic acid encoding a capsular
gene cluster of Streptococcus suis serotype 9 or a gene or
gene fragment derived thereof, preferably as identified in
Figure 5.

7. A nucleic acid probe or primer derived from a nucleic acid
according to any one of claims 1 to 6 allowing species or
serotype specific detection of Streptococcus suis.

8. A probe or primer according to claim 7 provided with at
least one reporter molecule.

9. A diagnostic test comprising a probe or primer according to
claim 7 or 8.

10. A protein or fragment thereof encoded by a nucleic acid
according to any one of claims 1 to 6.

11. A protein or fragment according to claim 10 capable of
polysaccharide biosynthesis.
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12. A method to produce a Streptococcus suis capsular antigen 
comprising using a protein or fragment according to claim 11. 

13. A Streptococcus suis capsular antigen obtainable by a 
method according to claim 12. 

14. A vaccine comprising an antigen according to claim 13 and
further comprising a suitable carrier or adjuvant.

15. A recombinant Streptococcus suis mutant provided with a
modified capsular gene cluster.

16. A recombinant micro-organism comprising at least a part of
a capsular gene cluster of Streptococcus suis.

17. A recombinant micro-organism according to claim 16
comprising a lactic acid bacterium.

18. A vaccine comprising a mutant according to claim 15 or a
micro-organism according to claim 16 or 17.

19. A vaccine according to claim 18 comprising a Streptococcus
mutant deficient in capsular expression.

20. A vaccine according to claim 19 wherein said Streptococcus
mutant has been derived by recombinant techniques, preferably
through homologous recombination.

21. A vaccine according to claim 19 or 20 wherein said mutant
is capable of surviving in an immune-competent host.

22. A vaccine according to claim 21 wherein said mutant is
capable of surviving at least 4-5 days, preferably at least
8-10 days, in said host.

23. A vaccine according to any of claims 19 to 22 comprising a
mutant capable of expressing a Streptococcus virulence factor
or antigenic determinant.

24. A vaccine according to any of claims 19 to 23 comprising a
mutant capable of expressing a non-Streptococcus protein.

25. A vaccine according to claim 24 wherein said
non-Streptococcus protein has been derived from a pathogen.

26. A method for controlling or eradicating a Streptococcal
disease in a population comprising vaccinating subjects in
said population with a vaccine according to any one of claims
18 to 25.

27. A method for controlling or eradicating a Streptococcal 
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disease comprising testing a sample collected from at least 

one subject in a population partly or wholly vaccinated with a
vaccine according to any one of claims 19 to 25 for the
presence of encapsulated Streptococcal strains.

28. A method for controlling or eradicating a Streptococcal
disease comprising testing a sample collected from at least
one subject in a population partly or wholly vaccinated with a
vaccine according to any one of claims 19 to 25 for the
presence of capsule-specific antibodies directed against
Streptococcal strains.

29. A method for controlling or eradicating a Streptococcal
disease in a population comprising selecting subjects in said
population vaccinated with a vaccine according to any one of
claims 19 to 25 and testing a sample collected from at least
one subject in said population for the presence of
encapsulated Streptococcal strains and/or for the presence of
capsule-specific antibodies directed against Streptococcal
strains.
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AAGCTTGGAT ATTGATCACA TGATGGAGGT GATGGAAGCA TCTAAGTCTG CAGCGGGGTC 
GGCGTGCCCA AGTCCGCAGG CTTATCAGGC AGCTTTTGAG GGAGCTGAGA 
ACATTATCGT TGTGACGATT ACAGGTGGGC TATCGGGTAG TTTTAATGCG GCACGTGTAG 
CTAGGGATAT GTATATCGAA GAGCATCCGA ATGTCAATAT CCATTTGATA 
GATAGTTTGT CAGCCAGTGG GGAAATGGAT TTACTTGTAC ACCAAATCAA TCGCTTAATT 
AGTGCAGGAT TAGATTTTCC ACAAGTAGTA GAAGCGATAA CTCACTATCG 
GGAACACAGT AAGCTCCTCT TTGTTTTAGC GAAAGTTGAT AATCTTGTTA AGAATGGAAG 
ACTGAGCAAA TTGGTAGGCA CTGTCGTTGG TCTTCTCAAT ATCCGTATGG 
TTGGTGAGGC AAGTGCTGAA GGAAAATTAG AGTTGCTTCA AAAGGCGCGT GGTCATAAGA 
AATCTGTGAC AGCAGCCTTT GAAGAAATGA AAAAAGCAGG CTATGATGGT 
GGTCGAATTG TTATGGCCCA CCGCAACAAT GCTAAGTTCT TCCAACAATT CTCAGAGTTG 
GTAAAAGCAA GTTTTCCAAC GGCTGTTATT GACGAAGTTG CAACATCAGG 
TCTATGCAGT TTTTATGCTG AAGAAGGTGG ACTTTTGATG GGCTACGAAG TGAAAGCGTG 
ATTCACAGAG TAATAATTTT GGGCTGTAAT TTCCGCTATA GAATAATCCC 
CCTCTTCTTC TAAGTTCGAG GGGGATTGTT TGTATGAGAC TATTGGATTT CATTCATTCA 
AATATCTTAC GAATTGCTCC AGTTTATCTG CAAAATCTTG TTCAAAGAAG 
ATCTGTAAGA AATCAGCTTT CTGTCCGCTG AAATAATAAC ATTTTCCAAA CATGTGTTGG 
ATGCTAGGAG AAAGAATCCC CTTGCTTAGC TGAAAGGTCA CGCTCCCCTT 
TGGAATTCGA TACGGGATGT TTAAAGCGTA TTTCTCTAGA CAGTCTTTTA TTTTATTCCA 
TTGAGCGTGA TAAATGTGAT GAAGATGCTG TGTGTTCCGC GCAAACATAC 
CGTTATCAAT GTAGAGCGAG AGAGCTTTTT GCATGATAAG ATTGGTATCG TAGTCGATTA 
GACTCTTATG TTTGATGAAG ATATCACGTA GCTGATTAGG AAGGCTGATT 
GCACCGATTC GGAGGGCAGG AAAGAGTGTC GGTGTAAAAG ATTTTATATA GATGACGCGA 
TTATCTGTAT CAAGATAGTG TAAAGGTAGG CTATGACTAG AGTCGAAATC 
TGCTAAATAG TCATCCTCAA TGATGTAGAC ATCGTATTGC TTTGCTAATT TTACGATGGC 
TGTTTTTGTT GCTATATCAT AGGTTGAACC GAGAGGGTTG TGCAAGCGAG 
GAATTGTGTA GAAAAACTTA ATTTTTCCAG TTTGGAAGAT ACTTTCCAAT TCTTCTAGGT 
CAATTCCATC TAAATTCCGT TCAATTGTTT GATAGGGGAT TCCTTGATGT 
CGAATGAGCT CTATCATTCG TGAATAGGTA GGGTTCTCTA TCAAGATTTC CGTTTTTCCA 
GCCAAGGTTT CCATTTGTGT GAGAATATAT AGAGCTTGTT GACTACCAGC 
TGTGATAACC AGCTGGTCTT TTTTTGTATA GACATGATAG TCCATTAACA GACTTTGAAC 
GGAGGAAATC AATTCTGCCA ATCCCTCTTG CTGGTGATAG TAGTTGAATA 
GGTAATTTTC CCGCCCAATA AGACTTTCTT TTAGACAAAT CCGAAAATCT TCATAGGTAA 
TTCTTGAAAG TCTGTAGGAT TGAGCTCTAC AGGTATGGTC TTGGAAATCT 
CTATCCTCTA AGATATAATA ACCGCTTTTT TCGACAGCGT AGATCTTATT TTGGTATTTT 
AATTCCAACA TAGCCTTTTG GACAGTGTCT TTGCTACAAT GATATTGCTC 
GCGGAGTTGA CGGATAGAAG GTAATTTCTC TCCACGTTTG AATCGATGTT CCTCTATTCC 
AGTCAAAATA TCTTGGATGA TAACTTGATA TTTTTTCATC TAGGTCCCCT 
TTTTTATAGA CTATGTTACT AGCTAGTATA TAGAAAAAAT TGAAGAAAGA CAATATATGA 
ATAATGGGGT TGAGGTTCAG GAATTAAGCT ACTCTATGGT ATAATTAAGT 
GATGAAAATA ATTATACCTA ATGCAAAAGA AGTAAATACA AATCTAGAGA ATGCCTCGTT 
TTATCTCCTG TCTGATCGAA GCAAGCCGGT GCTGGATGCC ATAAGTCAAT 
TTGATGTAAA AAAGATGGCT GCCTTTTATA AATTGAATGA AGCAAAGGCT GAGTTAGAAG 
CTGACCGTTG GTATCGAATC AGGACAGGTC AAGCAAAAAC CTATCCAGCC 
TGGCAGTTAT ATGATGGTCT CATGTATCGT TATATGGATA GGCGAGGTAT AGATTCGAAA 
GAAGAAAATT ATTTACGTGA CCACGTTCGT GTAGCGACAG CCTTATACGG 
ATTGATTCAT CCTTTTGAAT TCATTTCACC TCACCGCTTA GATTTTCAAG GGAGCTTAAA 
GATAGGCAAT CAGTCTTTGA AACAGTACTG GCGACCGTAT TATGACCAAG 
AAGTTGGTGA TGATGAACTG ATTCTCTCAC TGGCTTCGTC AGAATTTGAG CAGGTGTTTT 
CTCCCCAGAT TCAGAAAAGA TTAGTTAAAA TTCTTTTCAT GGAAGAAAAA 
GCAGGTCAGC TAAAAGTTCA CTCGACTATA TCAAAAAAAG GCAGAGGAAG ATTGCTGTCC 
TGGTTGGCTA AGAACAATAT TCAGGAATTA TCGGACATTC AAGATTTTAA 
GGTGGATGGC TTTGAATATT GTACTTCCGA ATCAACGGCA AACCAACTTA CCTTCATACG 
ATCAATAAAA ATGTGAAATT ATGAAAAAGA TAACGTTTTC CAGCGCTAAA 
AAGGGTAGAA AAATATTAAT TTCTATGATA TAATGGATGC GTTATAGGTA AAAGTCTAGG 
AAGGTTGTTT ATGAAAAAGA GAAGCGGACG AAGTAAGTCG TCCAAGTTCA 
AATTGGTAAA TTTTGCGCTT TTGGGACTTT ATTCCATTAC TCTATGTTTG TTCTTAGTGA 
CCATGTATCG CTATAACATC CTAGATTTCC GGTATTTAAA CTATATTGTG 
ACGCTTTTGC TAGTAGGAGT GGCAGTATTG GCTGGATTAT TGATGTGGCG TAAGAAAGCG
CGCATATTTA CAGCGCTCTT ACTTGTTTTT TCACTGGTCA TCACGTCTGT
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TGGGATCTAT GGAATGCAAG AAGTTGTAAA ATTTTCAACA CGACTAAATT CAAATTCGAC 
ATTTTCAGAA TATGAAATGA GTATCCTTGT CCCAGCAAAT AGTGATATTA 
CGGACGTTCG TCAGCTTACT AGTATCCTTG CTCCAGCCGA ATACGACCAA GATAACATCA 
CCGCTTTATT GGATGACATA TCCAAAATGG AATCTACTCA ACTAGCAACT 
AGCCCCGGGA CTTCTTACCT GACAGCATAT CAATCTATGT TGAATGGCGA GAGTCAAGCG 
ATGGTGTTCA ACGGAGTTTT TACCAATATT TTAGAAAATG AAGATCCAGG 
CTTTTCTTCA AAAGTGAAAA AAATATATAG TTTCAAAGTG ACTCAGACTG TTGAAACAGC 
TACTAAGCAG GTGAGTGGAG ATAGCTTTAA TATCTATATT AGTGGTATTG 
ATGCTTATGG ACCGATTTCT ACGGTCTCTC GTTCAGATGT CAATATCATT ATGACTGTCA 
ATCGTGCGAC ACATAAGATT TTATTGACAA CTACTCCACG AGATTCATAC 
GTTGCTTTCG CAGATGGCGG GCAAAATCAA TACGATAAAC TAACACATGC TGGTATTTAC 
GGTGTCAATG CTTCTGTGCA CACCTTAGAA AATTTTTATG GGATTGACAT 
TAGCAATTAT GTGCGGTTGA ACTTCATTTC CTTCCTTCAA TTAATCGACT TGGTGGGTGG 
AATTGATGTA TATAACGATC AAGAATTTAC AAGTTTACAT GGGAATTATC 
ATTTCCCTGT TGGACAAGTT CATTTAAACT CAGACCAAGC ATTAGGCTTC GTTCGAGAGC 
GCTACTCTTT AACAGGGGGT GACAATGACC GTGGTAAAAA CCAGGAAAAA 
GTGATTGCTG CCTTGATTAA AAAGATGAGT ACGCCAGAGA ATCTAAAAAA TTACCAGGCA 
ATCCTATCTG GATTGGAAGG CTCAATTCAA ACGGATTTGA GCTTAGAAAC 
GATTATGAGT TTAGTGAATA CCCAACTAGA ATCAGGAACA CAATTTACAG TAGAGTCACA 
AGCATTGACA GGAACAGGAC GCTCAGACTT ATCTTCTTAT GCGATGCCTG 
GATCACAACT TTATATGATG GAAATTAACC AAGATAGTCT GGAGCAATCA AAGGCAGCGA 
TTCAGTCCGT ACTTGTTGAA AAATAAAGAT TTTAGGAGAA AATATGAACA 
ATCAAGAAGT AAATGCAATC GAAATCGATG TTTTATTCTT ACTAAAAACA ATTTGGAGAA 
AGAAATTTTT AATTCTCTTA ACTGCAGTGT TGACTGCGGG GTTGGCATTT 
GTCTACAGTA GTTTTTTAGT GACACCTCAA TATGACTCCA CTACCCGTAT CTATGTAGTG 
AGTCAAAATG TTGAAGCCGG TGCGGGCTTG ACTAACCAAG AGTTACAAGC 
GGGTACCTAT TTGGCAAAAG ACTATCGGGA AATTATCCTA TCACAAGATG TATTGACACA 
AGTAGCAACG GAATTGAATC TGAAAGAGAG TTTGAAAGAA AAAATATCAG 
TTTCTATTCC TGTTGATACT CGTATCGTTT CTATTTCTGT GCGTGATGCG GATCCAAATG 
AAGCGGCACG TATTGCAAAT AGCCTTCGCA CCTTTGCAGT GCAAAAGGTT 
GTTGAGGTCA CCAAGGTAAG CGATGTGACG ACACTTGAAG AAGCAGTCCC AGCGGAAGAA 
CCAACCACTC CAAATACAAA ACGAAATATC TTGCTTGGTT TATTAGCTGG 
AGGTATCTTG GCAACAGGTC TTGTACTGGT TATGGAGGTT TTGGATGACC GTGTAAAACG 
TCCTCAGGAC ATCGAAGAGG TAATGGGATT GACATTGCTA GGTATAGTAC 
CAGATTCGAA GAAATTAAAA TAGGAGAACA ATATGGCGAT GTTAGAAATT GCACGTACAA 
AAAGAGAGGG AGTAAATAAA ACCGAGGAGT ATTTCAATGC TATCCGTACC 
AATATTCAGC TTAGCGGAGC AGATATTAAG GTTGTTGGTA TTACCTCTGT TAAATCGAAT 
GAAGGTAAGA GTACAACTGC GGCTAGTCTC GCTATTGCCT ATGCTCGTTC 
AGGTTATAAG ACCGTCTTGG TGGATGCAGA TATCCGAAAT TCAGTCATGC CTGGTTTCTT 
CAAGCCAATT ACAAAGATTA CAGGTTTGAC GGATTACCTA GCAGGGACAA 
CAGACTTGTC TCAAGGATTA TGCGATACAG ATATTCCAAA CTTGACCGTA ATTGAGTCAG 
GAAAGGTTTC TCCCAACCCT ACTGCCCTTT TACAAAGTAA GAATTTTGAA 
AATCTACTTG CGACTCTTCG TCGCTATTAT GATTATGTTA TCGTTGACTG TCCACCATTA 
GGACTGGTAA TTGATGCAGC TATCATTGCA CAAAAATGTG ATGCGATGGT 
TGCAGTAGTA GAAGCAGGCA ATGTTAAGTG CTCATCTTTG AAAAAAGTAA AAGAGCAGTT 
GGAACAAACA GGCACACCGT TCTTAGGCGT TATCTTGAAC AAATATGATA 
TTGCCACTGA GAAGTATAGT GAATACGGAA ATTACGGCAA AAAAGCCTAA TTTCTCAGAT 
AACATAAGTT TGATAAGTAG GTATTAATAT GATTGATATC CATTCGCATA 
TCATATTTGG TGTGGATGAC GGTCCCAAAA CTATTGAAGA GAGCCTGAGT TTGATAAGCG 
AAGCTTATCG TCAAGGTGTT CGCTATATCG TAGCGACATC TCATAGACGA 
AAAGGGATGT TTGAAACACC AGAAAAAATC ATCATGATTA ACTTTCTTCA ACTTAAAGAG 
GCAGTAGCAG AAGTTTATCC TGAAATACGA TTGTGCTATG GTGCTGAATT 
GTATTATAGT AAAGATATCT TAAGCAAACT TGAAAAAAAG AAAGTACCAA CACTTAATGG 
CTCGTGCTAT ATTCTCTTGG AGTTCAGTAC GGATACTCCT TGGAAAGAGA 
TTCAAGAAGC AGTGAACGAA ATGACGCTAC TTGGGCTAAC TCCCGTACTT GCCCATATAG 
AGCGTTATGA TGCTCTGGCA TTTCAGTCAG AGAGAGTAGA AAAGCTAATT 
GACAAGGGAT GCTACACTCA GGTAAATAGT AACCATGTGT TGAAGCCTGC TTTAATTGGC 
GAACGAGCAA AAGAATTTAA AAAACGTACT CGATATTTTT TAGAGCAGGA 
TTTAGTACAT TGTGTTGCTA GCGATATGCA TAATTTATAT AGTAGACCTC CGTTTATGAG 
GGAGGCGTAT CAGCTTGTAA AAAAAGAGTA TGGTGAGGAT AGAGCGAAGG 
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CTTTGTTCAA GAAAAATCCT TTGTTGATAT TGAAAAATCA AGTACAGTAA CCTCATAGAA 
ATAGTGGAGG AGCTATGAAT ATTGAAATAG GATATCGCCA AACGAAATTG 
GCATTGTTTG ATATGATAGC AGTTACGATT TCTGCAATCT TAACAAGTCA TATACCAAAT 
GCTGATTTAA ATCGTTCTGG AATTTTTATC ATAATGATGG TTCATTATTT 
TGCATTTTTT ATATCTCGTA TGCCGGTTGA ATTTGAGTAT AGAGGTAATC TGATAGAGTT 
TGAAAAAACA TTTAACTATA GTATAATATT TGTAATTTTT CTTATGGCAG 
TTTCATTTAT GTTAGAGAAT AATTTCGCAC TTTCAAGACG TGGTGCCGTG TATTTCACAT 
TAATAAACTT CGTTTTGGTA TACCTATTTA ACGTAATTAT TAAGCAGTTT 
AAGGATAGCT TTCTATTTTC GACAACCTAT CAAAAAAAGA CGATTCTAAT TACAACGGCT 
GAACTATGGG AAAATATGCA AGTTTTATTT GAATCAGATA TACTATTTCA 
AAAAAATCTT GTTGCATTGG TAATTTTAGG TACAGAAATA GATAAAATTA ATTTACCATT 
ACCGCTCTAT TATTCTGTTG AAGAAGCTAT AGGGTTTTCA ACAAGGGAAG 
TGGTCGACTA CGTCTTTATA AATTTACCAA GTGAATATTT TGACTTAAAG CAATTAGTTT 
CAGACTTTGA GTTGTTAGGT ATTGATGTAG GCGTTGATAT TAATTCATTC 
GGTTTTACTG TGTTGAAGAA TAAAAAAATC CAAATGCTAG GTGACCATAG CATCGTCACT 
TTTTCCACAA ATTTTTATAA GCCTAGTCAC ATCTGGATGA AACGACTTTT
AGATATACTT GGAGCAGTAG TCGGGTTAAT TATTAGTGGT ATAGTTTCTA TTTTGTTAAT 
TCCAATTATT CGTAGAGATG GTGGGCCAGC CATTTTTGCT CAGAAACGAG 
TTGGACAGAA TGGACGCATA TTTACATTCT ACAAGTTTCG TTCGATGTTT GTTGATGCCG 
AGGTACGTAA GAAAGAATTA ATGGCTCAAA ACCAGATGCA AGGTGGGATG 
TTCAAAATGG ACAACGATCC TAGAATTACT CCAATTGGAC ACTTCATACG AAAAACAAGT 
TTAGATGAGT TACCACAATT TTATAATGTT CTAATTGGAG ATATGAGTCT 
AGTCGGTACC CGTCCGCCTA CAGTTGATGA ATTTGAAAAA TATACTCCTA GTCAAAAGAG 
AAGATTGAGT TTTAAACCAG GGATTACAGG TCTTTGGCAA GTGAGCGGAA 
GAAGTGATAT CACAGATTTT AATGAAGTCG TTAGGCTGGA CCTAACATAC ATTGATAATT 
GGACCATCTG GTCAGACATT AAGATTTTAT TGAAGACAGT GAAAGTTGTA 
TTGTTGAGAG AGGGAGGTCA GTAAGACTCC TTTAAAACAA AGAATAGTAG TAGGGGATAT 
GAGAACAGTT TATATTATTG GTTCAAAAGG AATACCAGCA AAGTATGGTG 
GTTTCGAGAC TTTCGTAGAA AAATTAACTG AGTATCAGAA AGATAAATCA ATTAATTATT 
TTGTTGCATG TACAAGAGAA AATTCAGCAA AATCAGATAT TACAGGAGAA 
GTTTTTGAAC ATAATGGAGC AACATGTTTT AATATTGATG TGCCAAATAT TGGTTCAGCA 
AAAGCCATTC TTTATGATAT TATGGCTCTC AAGAAATCTA TTGAAATTGC
CAAAGATAGA AATGATACCT CTCCAATTTT CTACATTCTT GCTTGTCGGA TTGGTCCTTT 
CATTTATCTT TTTAAGAAGC AGATTGAATC AATTGGAGGT CAACTTTTCG 
TAAACCCAGA CGGTCATGAA TGGCTACGTG AAAAGTGGAG TTATCCCGTC CGACAGTATT 
GGAAATTTTC TGAGAGTTTG ATGTTAAAAT ACGCTGATTT ACTAATTTGT 
GATAGCAAAA ATATTGAAAA ATATATTCAT GAAGATTATC GAAAATATGC TCCTGAAACA 
TCTTATATTG CTTATGGAAC AGACTTAGAT AAATCACGCC TTTCTCCGAC 
AGATAGTGTA GTACGTGAGT GGTATAAGGA GAAGGAAATT TCAGAAAATG ATTACTATTT 
GGTTGTTGGA CGATTTGTGC CTGAAAATAA CTATGAAGTA ATGATTCGAG 
AGTTTATGAA ATCATATTCA AGAAAAGATT TTGTTTTGAT AACGAATGTA GAGCATAATT 
CCTTTTATGA GAAATTGAAA AAAGAAACAG GGTTCGATAA AGATAAGCGT 
ATAAAGTTTG TTGGAACAGT CTATAATCAG GAGCTGTTAA AATATATTCG TGAAAATGCA 
TTTGCTTATT TTCATGGTCA CGAGGTTGGA GGAACGAACC CATCTTTACT 
TGAAGCACTT TCTTCTACTA AACTAAATCT TCTTCTAGAT GTGGGCTTTA ATAGAGAAGT 
AGGGGAAGAA GGAGCGAAAT ACTGGAATAA AGATAATCTT CACAGAGTTA 
TTGACAGTTG TGAGCAATTA TCACAAGAAC AAATTAATGA TATGGATAGT TTATCAACAA 
AACAAGTCAA AGAAAGATTT TCTTGGGATT TTATTGTTGA TGAGTATGAG 
AAGTTGTTTA AAGGATAAGT TATGAAAAAG ATTCTATATC TCCATGCTGG AGCAGAATTA 
TATGGGGCAG ATAAGGTTCT CTTGGAACTT ATAAAAGGCT TAGATAAGAA 
TGAATTTGAA GCGCATGTTA TCCTACCTAA TGATGGAGTC CTAGTGCCAG CATTAAGAGA 
AGTTGGTGCG CAAGTTGAAG TTATTAACTA TCCAATTCTA CGTAGGAAAT 
ATTTTAATCC AAAAGGGATT TTTGACTACT TCATATCATA TCATCACTAT TCTAAACAGA
TTGCTCAATA TGCCATAGAA AATAAGGTTG ACATAATTCA CAATAATACT 
ACCGCTGTCT TAGAAGGCAT TTATCTGAAG CGAAAACTCA AATTACCTTT GTTGTGGCAT 
GTTCATGAGA TTATTGTCAA ACCTAAATTC ATCTCTGATT CGATCAATTT 
TTTAATGGGG CGTTTTGCTG ATAAGATTGT GACAGTTTCA CAGGCTGTGG CAAACCATAT 
AAAACAATCA CCTCATATCA AAGATGACCA AATCAGTGTA ATCTACAATG
GGGTAGATAA TAAAGTGTTT TATCAGTCCG ATGCTCGGTC TGTTCGAGAA AGATTTGACA 
TTGACGAAGA GGCTCTTGTC ATTGGTATGG TCGGTCGAGT CAATGCGTGG
AAAGGACAAG GAGATTTTTT AGAAGCAGTT GCTCCTATAC TCGAACAGAA TCCAAAAGCT
ATCGCCTTTA TAGCAGGAAG TGCTTTTGAA GGAGAAGAGT GGCGAGTAGT 
AGAATTAGAA AAGAAGATTT CTCAATTAAA GGTCTCTTCT CAAGTCAGAC GAATGGATTA 
TTATGCAAAT ACCACTGAAT TATATAATAT GTTTGATATT TTTGTACTTC 
CAAGTACTAA TCCAGACCCT CTACCAACGG TTGTACTAAA AGCAATGGCA TGCGGTAAAC 
CTGTTGTCGG TTACCGACAT GGTGGTGTTT GTGAGATGGT GAAAGAAGGT
GTTAACGGTT TCTTAGTCAC TCCGAACTCA CCGTTAAATT TATCAAAAGT AATTCTTCAG 
TTATCGGAAA ATATAAATCT CAGAAAAAAA ATTGGTAATA ATTCTATAGA 
ACGTCAAAAA GAACATTTTT CGTTAAAAAG CTATGTAAAA AATTTTTCGA AAGTCTACAC 
CTCCCTCAAA GTATACTGAT TGGCTGAAGT GAATGCTTTA GTATAGCGAT 
TTATCGTATT CTCATTCGAT AAAACAAATG TTCAGAAACA GTTATAAGTT ATTTCTAAAG 
GGCACCTCTA TAAACTCCCA AAATTGCGAA TTTGGAGTTA CGAAAGCCTT 
GTTAAATCAA CATTTTAAAT TTTAGAAAAT TAGTTTTTAG AGCTCCCCTA AAATAGAAGA 
TAACAGAAGG GAGCCTTCAA AAACTTCATT TTTAATTGGA TTGTAGAAAA 
ACTGTTAAAT CAATATTTAG ATTTTTAGGA GTTCAGTTTT TGGGGGGAGA GCTTAATAAT 
CTATGCACTA TATTTCGAAA AATATATGGT GTAAAATCAG AACTGATGGT 
CGTGGCAAAA AAGAGAATGA GGAATTTATG AAAATTATTT CTTTTACAAT GGTTAATAAC 
GAAAGTGAGA TAATAGAGTC ATTTATACGG TATAATTATA ACTTTATTGA 
CGAGATGGTC ATTATTGATA ATGGTTGTAC AGATAACACG ATGCAAATTA TTTTTAATTT 
GATTAAAGAG GGATATAAAA TATCCGTATA TGATGAGTCT TTAGAGGCAT 
ATAATCAGTA TCGACTTGAT AATAAATATC TAACGAAAAT AATTGCTGAA AAAAATCCAG 
ATTTGATAAT ACCTTTGGAT GCGGATGAAT TTTTAACAGC CGATTCAAAT 
CCACGGAAAC TTTTGGAACA ACTGGACTTA GAAAAGATAC ATTATGTGAA TTGGCAATGG 
TTTGTTATGA CTAAAAAAGA TGATATTAAT GATTCGTTTA TACCACGTAG 
AATGCAATAT TGTTTTGAAA AACCTGTTTG GCATCATTCT GATGGTAAAC CAGTTACTAA 
ATGTATAATT TCCGCTAAGT ATTACAAAAA AATGAATTTA AAGCTATCGA 
TGGGACATCA CACTGTTTTT GGTAACCCAA ATGTAAGGAT AGAACATCAT AATGATTTGA 
AATTTGCACA TTATCGAGCT ATTAGCCAAG AGCAATTAAT TTATAAAACA 
ATTTGTTACA CTATTCGCGA TATTGCTACT ATGGAGAACA ATATCGAAAC AGCTCAAAGA 
ACAAATCAGA TGGCGCTCAT TGAATCTGGC GTGGATATGT GGGAAACGGC 
GAGAGAAGCC TCTTATTCAG GTTATGATTG TAATGTTATA CATGCACCAA TTGATTTAAG 
TTTTTGTAAA GAAAATATTG TAATAAAATA TAACGAACTA TCCAGAGAAA 
CAGTAGCAGA ACGCGTGATG AAAACGGGAA GAGAAATGGC TGTTCGTGCA TATAATGTGG 
AGCGAAAACA AAAAGAAAAG AAATTTCTAA AACCTATTAT ATTTGTATTA 
GATGGGTTAA AAGGAGATGA GTATATTCAT CCCAATCCAT CAAATCATTT GACGATCTTA 
ACTGAAATGT ATAACGTCAG AGGCTTACTT ACCGATAATC ACCAAATTAA 
ATTTCTCAAA GTTAATTATA GATTAATTAT AACTCCAGAT TTTGCTAAGT TTTTACCGCA 
TGAATTTATT GTTGTACCAG ATACCTTGGA TATAGAGCAA GTTAAAAGCC 
AGTATGTTGG TACAGGTGTA GACTTGTCAA AGATTATTTC TTTAAAAGAG TATCGAAAAG 
AGATAGGCTT TATTGGTAAT TTGTATGCGC TTTTAGGATT TGTTCCGAAT 
ATGCTCAATA GAATTTATCT ATATATTCAG AGAAACGGTA TTGCAAACAC TATTATAAAA 
ATCAAGTCGA GATTGTGAGA GTTGTTTACT TTTATTTGTA ATTTTAAAAG 
TAATGCAGGC AGATAGGAGA AAAACGTTTG GAAAAATGAG AATAAGAATT AATAATTTGT 
TTTTTGTTGC CATAGCGTTT ATGGGCATAA TTATTAGTAA TTCGCAAGTT 
GTTCTAGCGA TAGGCAAAGC TTCTGTGATT CAGTATCTAT CTTATTTAGT TTTGATTTTA 
TGTATAGTTA ATGATTTATT AAAAAATAAC AAACATATTG TAGTTTATAA 
ATTAGGGTAT TTGTTTCTTA TTATATTTTT ATTTACTATC GGAATATGTC AGCAAATTCT 
TCCTATAACA ACTAAAATAT ATTTATCAAT TTCAATGATG ATTATTTCAG 
TTTTAGCAAC GTTGCCAATA AGTTTGATAA AAGATATTGA TGATTTTAGA CGGATTTCAA 
ATCATTTGTT ATTCGCTCTT TTTATAACTT CGATATTAGG AATAAAGATG 
GGGGCAACGA TGTTCACGGG GGCAGTAGAA GGTATCGGTT TTAGTCAGGG TTTTAATGGA 
GGATTGACGC ATAAGAACTT TTTTGGAATA ACTATTTTAA TGGGGTTCGT 
ATTAACTTAC TTGGCGTATA AGTATGGTTC CTATAAAAGA ACGGATCGTT TTATTTTAGG 
ATTAGAATTG TTTTTGATTC TTATTTCAAA CACACGCTCA GTTTATTTAA 
TACTATTGCT TTTTCTATTT CTTGTTAATC TTGACAAAAT CAAAATAGAA CAAAGACAAT 
GGAGTACGCT TAAATATATT TCCATGCTAT TTTGTGCTAT TTTTTTATAC
TATTTCTTTG GTTTTTTAAT AACACATAGT GATTCTTACG CTCATCGCGT TAATGGTCTT 
ATTAATTTTT TTGAGTATTA TAGAAATGAT TGGTTCCATC TAATGTTTGG 
TGCAGCGGAT TTGGCATATG GGGATTTAAC TTTAGACTAT GCTATAAGGG TTAGACGCGT 
TTTAGGTTGG AATGGAACGC TTGAAATGCC CTTACTGAGT ATTATGTTAA
AAAATGGTTT TATCGGTCTG GTAGGGTATG GGATTGTTTT ATATAAACTT TATCGTAATG
TAAGAATATT AAAAACAGAT AATATAAAAA CAATAGGAAA GTCTGTATTT
ATCATTGTAG TCCTATCTGC AACAGTAGAA AATTATATTG TAAATTTAAG TTTTGTATTT
ATGCCAATAT GTTTTTGTTT ATTAAATTCT ATATCTACTA TGGAATCAAC
TATTAACAAA CAACTGCAAA CATAAATTGG CAGGAATAGA GTTTTGAGTT GCTATTAATT
TGGTAGAGCA TATGTTCTAT AGGTGGCAAG ATAAAGATAG TATTTTTTAC
ATGATGATTT TTATGATAGC AAAGCAAGTT ACGGCATAAA AGGAATTAGA GGATGGAAAA
AGTCAGCATT ATTGTACCTA TTTTTAATAC GGAAAAGTAC TTAAGAGAGT
GTTTAGATAG CATTATTTCC CAATCGTATA CTAATCTAGA GATTCTTTTG ATAGATGACG
GTTCTTCAGA TTCATCAACG GATATATGTT TGGAATACGC AGAGCAAGAT
GGTAGAATAA AACTTTTCCG GTTACCAAAT GGTGGTGTTT CAAACGCAAG GAATTACGGT
ATCAAAAATA GCACAGCAAA TTATATTATG TTTGTAGATT CTGATGATAT
TGTTGACGGC AACATTGTTG AGTCCTTATA CACCTGTTTA AAAGAGAATG ATAGTGATTT
GTCGGGAGGG TTACTTGCTA CTTTTGATGG AAATTATCAA GAATCTGAGC
TGCAAAAGTG TCAAATTGAT TTGGAAGAGA TAAAAGAGGT GCGAGACTTA GGAAATGAAA
ATTTTCCCAA TCATTATATG AGCGGTATCT TTAATAGCCC TTGTTGCAAA
CTTTATAAGA ATATATATAT AAACCAAGGT TTTGACACTG AACAGTGGTT AGGAGAGGAC
TTATTATTTA ATCTAAATTA TTTAAAGAAT ATAAAAAAAG TCCGCTATGT
TAACAGAAAT CTTTATTTTG CCAGAAGAAG TTTACAAAGT ACTACAAATA CGTTTAAATA
TGATGTTTTT ATTCAATTAG AAAATTTAGA AGAAAAAACT TTTGATTTGT
TTGTTAAAAT ATTTGGTGGA CAATATGAAT TTTCTGTTTT TAAAGAGACG CTACAGTGGC
ATATTATTTA TTATAGCTTA TTAATGTTCA AAAATGGAGA TGAATCGCTT
CCAAAGAAAT TGCATATATT TAAGTATTTA TACAATAGGC ATTCTTTAGA TACTCTAAGT
ATTAAACGAA CGTCCTCTGT TTTTAAAAGA ATATGTAAAT TAATTGTTGC
TAATAATTTG TTTAAAATTT TTTTAAATAC TTTAATTAGG GAAGAAAAAA ATAATGATTA
ACATTTCTAT CATCGTCCCA ATTTACAATG TTGAACAATA TCTATCCAAG
TGTATAAATA GCATTGTAAA TCAGACCTAC AAACATATAG AGATTCTTCT GGTGAATGAC
GGTAGTACGG ATAATTCGGA AGAAATTTGT TTAGCATATG CGAAGAAAGA
TAGTCGCATT CGTTATTTTA AAAAAGAGAA CGGCGGGCTA TCAGATGCCC GTAATTATGG
CATAAGTCGC GCCAAGGGTG ACTACTTAGC TTTTATAGAC TCAGATGATT
TTATTCATTC GGAGTTCATC CAACGTTTAC ACGAAGCAAT TGAGAGAGAG AATGCCCTTG
TGGCAGTTGC TGGTTATGAT AGGGTAGATG CTTCGGGGCA TTTCTTAACA
GCAGAGCCGC TTCCTACAAA TCAGGCTGTT CTGAGCGGCA GGAATGTTTG TAAAAAGCTG
CTAGAGGCGG ATGGTCATCG CTTTGTGGTG GCCTGGAATA AACTCTATAA
AAAAGAACTA TTTGAAGATT TTCGATTTGA AAAGGGTAAG ATTCATGAAG ATGAATACTT
CACTTATCGC TTGCTCTATG AGTTAGAAAA AGTTGCAATA GTTAAGGAGT
GCTTGTACTA TTATGTTGAC CGAGAAAATA GTATCATAAC TTCTAGTATG ACTGACCATC
GCTTCCATTG CCTACTGGAA TTTCAAAATG AACGAATGGA CTTCTATGAA
AGTAGAGGAG ATAAAGAGCT CTTACTAGAG TGTTATCGTT CATTTTTAGC CTTTGCTGTT
TTGTTTTTAG GCAAATATAA TCATTGGTTG AGCAAACAGC AAAAGAAGCT
TCTCCAAACG CTATTTAGAA TTGTATATAA ACAATTGAAG CAAAATAAGC GACTTGCTTT
ACTAATGAAT GCTTATTATT TGGTAGGGTG TCTTCATCTT AATTTTAGTG
TCTTTCTGAA AACGGGGAAA GATAAAATTC AAGAAAGATT GAGAAGAAGT GAAAGTAGTA
CTCGGTAAGA ATGTTGTAAT AAATGGTTGA AAGAAAAGGG GATTAAAATG
AATCCAACAA ATAGTAGAAT AGCACTCTTT GATACGATTA AATGTATCAT GGTACTTTGT
GTTATTTTTA CACATCTGGA TTGGTCTGTT GAGCAGCGTC AATGGTTTAT
CTTTCCGTAT TTCGTTGACA TGGCTGTTCC AATTTTTCTG TTGCTTTCTG CCTATTTTCG
AACGAATAAG TGGAATACAA AACAAGAGAC GCTAAAGCTC AAGTTCAGCA
GTGGTATAAA AGAAAGTATA AACATGCTTT GTCTCTATGC TATCGTGATG GCTGTTAATG
TTTTATTGAG CTATTCGAGA ACCATCTGAT AGGAGTAAAG CCTTTTTCAG
GTTCTTCATC GCTCCGTTCA TTTGTCCTGT GGCTACTTTC TGGAGAATCG GGTCCAGGGA
GTTGGGAGTT ACTATGTTCC GTTGTTGATT CAGGTAGTTT TTTTATTACC
AATTTTGTAT GTTCTTTTCG AGAAAAATAA ATGGTTGGGC TTGCTTACTT GTTTTTTAGT
AAACTTTTCA GTGGATGCCA TATTTGCTAA CATGGCTGAA CACGGCATAT
ATATATAGAC TAATATCACT TCGTTATCTT TTTGTTCTAG GGCTTGGTTT TTTCTTTCAA
AGCAGGATGT GCGTTCCAAG GTAGATACTT TCATTGCGAC CCTATTTGGG
ATTATTGGAG CAATTCTGAT TTTTGTGAAT CATTCTATAG AGCCCTTCTC CTGGTTTTAT
GGTTGGAAGT CTACTTCCTT TCTATGCGTC CCATTTGCGT ATGCTATGCT
ATTTTTTATG ATAAAGTATG GACAGAAGAT TCCAGCAATA CTGTTGTCAA AATTGGGAGT
TGCTTCTTAT CATATCTACT TGACCCAGAT GCTGTATTTT TCAGTAGTCG
CACCATTTTT AGCAGTGCAA TTTAAGGTAT CTTCGTTGAA TTTGTGGAAC GGCTTGTTTA
CCTTTCTAAT TTGCCTGTTT GGTGGCTATA TTTTCTACAA AGTGGATCTG 
TTTATGAGAG TACGTGGAAA ACGATAATGA CTCATTTCAG ATTAGCAGAT GCCATTTCGT 
TTATTAGCAG ATTCGCATGT TAATATTCCG ACAAAGAAAT TCAAATAGGT 
TGACGAGAGA GGAGTGGTAT CTGTTTCTAA ACCCCAGTAT CCCCCTTTAT TTTCAAAGCT 
ATATTTATTA ACTGAACAAG GAGAATTTTT AAGAGAACTG TTTGTTTAAT 
CCCAGCACGA TCTGGTTCGA AAGGCTTACC GAATAAAAAC ATGCTATTTT TGGACGGGAA 
ACCCATGATT TTTCACACGA TTGATGTGGC AATTGAATCA GGTTGTTTTG 
AGAAAGAAGA CATCTATGTC AGTACGGATT CAGAAATGTA TAAGGGGGGC ACCTCTATAA 
ATTCCCAAAA TTGCGAATTT GGAGTTACGA AAGCCTTGTT AAATCAACAT 
CTTAAATTTT AGAAAATTAG TTTTTAGAGG TCCCCAAGGG GATTTGCGAG ACAAGAGGCA 
TCAATGTATT GTTAAGACCC AAAGAACTAT CTACTTATCA TACTCCATCG 
AATGAAGTCA GTACGCACTT TTTTACGAAT CTGGATTTTA TGAAGATTGT ATATTTGTTC 
TTCTGCAAGT CACCTCACCG TTACGGACTG GCGAACAGAT AAAAGAAGCC 
ATGAATATGT ACTTACAGGG GGACTCAGAA AATGTTTTGC ATTTCAATGA TGAAGGGCAA 
GAAAGAGTGA ATCAGTACAT TATCGAAGCT GTACAGGGGT TATAAAAAGG 
GGTTACTTAT CCTTAAAGTC TGTATGTAGA AGGAGAAAAA TTGAGACGAA TTTATATTTG 
CCATACGATG TATCAGATCC TGATTTCCTT GTTAAAGATG GACGTTGAGA 
GAGATAGTTT GATGTCCGTT GATATCATCG GGCATTTTCC AGATGTCAGG GAGCAACTGC 
AGCAGCATGT TCATCTAATC GAGGGAGACG GAGCGTTCAT TTGATCTATA 
TTCTTTGATA GCTAGATCAA AAACAAAAGA ACGCCTTTCC TTGTTACAGA GCTATGACGA 
GGTGATCATT TTTCAAGATC ACCGTCAAGT CGGTCATTTT TTAAATAAAC 
ATCGGATTCC CTATTCTCTT TTGGAGGATG GTTATAATTT TTTCAAGGAT AAAAGAGTGT 
GCGATTTGGA GTCAATTCAA TCATCTGTCT GGAAAAGACT CTTTTATCAA 
TGGTATTTTA AACCAACATA TTTGATTGGT TCAAGTCTCT ATTGTCAATC CATTGAGGTC 
AATGATCTGT CGCTCGTACA ATTTGACTAG GCTTATAAAC CCTTTGTAGA 
AGTTCCGAGA AAGCAATTAT TTGATCAAGC ATCGCCAGAG AAGGTGCAAG CGCTGCTGCA 
GATATTTGGA GCAAGGGCGA TAGTAGCGGA TGAAGAGTCT TCTCAAAAAC 
GATTGCTATT ATTGACCCAG CCCTTGTCTT GGGATTATCA TGTGACCGAA GAGAGTTGTT
GGAGATTTAT GTAGCAGGTC TTGCCCCTTA TCGGGAAGAC TATACAATCT 
ACATAAAACC GCACCCACGA GATGGGGTTG ATTATTCATT TCTGGGTAAG GCTGTGGTGC 
TTCTGCCTCA AGGTATTCCG TTTGAGTTGT TCGAAATGGC AGGTAATATC 
CGTTTTGATA TCGGTATGAC CTATAGTTCG TCTGCTTTAG ATTTTTTAAA TTGTTTTGAA 
GAGAAAGTGT ATTTAAAGGA CACTTTTCCT CTTCTTTCAA AAAATGATAT 
TTTGCGTGAG GGGATAGAAT AGGAGGATTC ATGTCTAAAA AATCAATAGT TGTCTCAGGT 
CTCGTCTATA CGATTGGAAC CATCCTCGTT CAGGGATTAG CCTTCATTAC 
CCTCCCCATC TATACTCGTG TCATTTCTCA GGAAGTATAT GGGCAGTTTA GCTTGTATAA 
TTCGTGGGTG GGGCTAGTTG GTCTCTTTAT CGGTCTACAG TTAGGTGGGG 
CTTTTGGCCC GGGATGGGTA CACTTCCGCG AGAAATTTGA TGATTTCGTA TCCACCTTGA 
TGGTCTCTTC TATCGCTTTC TTTTTACCAA TTTTTGGGCT ATCTTTTCTC 
CTCAGTCAGC CCCTATCGCT CCTATTTGGT TTGCCTGATT GGGTCGTTCC GCTTTACTTT 
TTGCAAAGTT TTATGAGTGT TGTGCAAGGA TTTTTTACGA CCTATTTAGT 
GCAGCGGCAG CAGTCCATGT GGACTTTACT CCTATCGGTA CTGAGCGCTG TTATCAACAC 
TGCTTTATCT TTATTTCTCA TCTTTTCGAT GGAGAATGAT TTCATCGCTC 
GTGTAATGGC AAACTCGGCA ACGACTGGTG TTTTTGCTTG TGTGTCCTTG TTGTTTTTCT 
ATAAGAAGAT TGGGCTTCAT TTTCGAAAGG ACTATCTTCG GTATGGTTTA 
AGTATATCGA TTCCTCTTAT TTTTCATGGA TTAGGTCATA ATGTACTCAA TCAATTTGAC 
AGAATCATGC TCGGCAAGAT GCTAACACTG TCAGATGTAG CCCTATACAG 
TTTCGGCTAC ACACTTGCGT CTATCTTACA AATTGTGTTT TCGAGCTTGA ATACGGTATG 
GTGTCCGTGG TATTTTGAGA AAAAGAGAGG TGCAGATAAA GATTTGCTCA 
GTTATGTCCG TTACTATCTG GCGATTGGCC TGTTTGTGAC TTTTGGATTT CTAACAATTT 
ACCCTGAATT AGCGATGTTG TTAGGTGGAT CTGAGTATCG TTTCAGTATG 
GGATTTATTC CCATGATTAT TGTCGGGGTG TTCTTTGTAT TTCTTTATAG TTTTCCAGCC 
AATATCCAGT TTTATAGTGG AAATACAAAG TTTTTGCCAA TTGGTACTTT 
TATAGCAGGT GTACTAAATA TTTCCGTCCA CTTTGTTTTG ATACCGACAA AGAATTTATG 
GTGCTGCTTT GCAACGACTG CTTCCTATCT GTTGTTGCTA GTCTTGCATT 
ATTTTGTTGC TAAGAAAAAG TATGCTTACG ATGAAGTTGC GATTTCAACA TTTGTTAAGG 
TAATTGCTCT TGTTGTCGTC TATACAGGCT TGATGACAGT ATTTGTCGGT 
TCAATCTGGA TTCGTTGGTC ACTAGGAATA GCGGTTCTAG TCGTTTATGC CTACATTTTT 
AGAAAGGAAT TAACAGTTGC CCTCAATACA TTCAGGGAAA AACGGTCTAA
ATAAGGGCAC CTCTATAAAC TCCCAAAATT GCGAATTTGG AGTTACGAAA GCCTTGTTAA
ATCAAACATT TTAAATTTTA GAAAATTAGT TTTTAGAGGT CCCCATATAA 
AAACGTCCCA AATGAGAGGT GCTCATAAGA ATTGACCATC ACTGCCATCT ACCCAAAGTT 
CAAGTATTCT CTACCATGAA AATTGTGCTA TAATCAAGTA TAAAGAAGGG 
AATGTTTCTT AAAGGACGTA TGCGCCTCTG CTTATGCCAG AAGTCATGAG GTAAATCTCC 
CTAAAAATTG GGTAGAAAAG CAGATTAAAC TTCCACCAAT CTATTGAAGA 
TCGTGTTGAA GAGCAGGCTT TAGAAGCAAC AAGCCCTGAG ACTATTCGAA AGAAATCTAG 
GGCTATTTTT TCTAATCGGC TATCAGAAGT GAAGTAGCGA TCTTTATTAG 
TGTTCTTTTA CTACTTAAGG AAAACCAAGC TGCTCCCTCA AGACTTTATG GGAGCGATTT 
ACAGTCATTT TTAGAAAGGA AATAAAATGG TTTATATTAT TGCAGAAATT 
GGTTGTAATC ACAACGGTGA TGTTCATCTA GCACGGAAAA TGGTAGAAGT TGCCGTTGAT 
TGTGGTGTGG ATGCCGTTAA ATTTCAGACA TTTAAGGCAG ATTTGTTGAT 
TTCAAAATAC GCACCAAAGG CCGAATACCA AAAAATTACA ACAGGAGAGT CAGATTCTCA 
GCTCGAAATG ACTCGTCGTT TGGAATTGAG CTTTGAAGAG TATCTTGATT 
TGCGTGATTA CTGTCTTGAA AAGGGAGTTG ATGTGTTTTC GACACCTTTT GATGAGGAAT 
CATTGGACTT CTTGATTAGC ACAGATATGC CCGTTTATAA GATTCCATCT 
GGTGAGATTA CCAATCTTCC CTATTTGGAA AAAATTGGTC GTCAAGCTAA GAAAGTTATT 
CTTTCAACTG GTATGGCTGT TATGGATGAA ATTCATCAAG CGGTGAAGAT
TTTGCAGGAA AATGGAACGA CCGATATTTC GATTTTGCAT TGTACAACCG AGTATCCAAC 
CCCTTACCCT GCTTTGAATT TGAATGTCTT GCATACCTTG AAAAAAGAAT 
TTCCAAACTT AACAATTGGC TATTCAGACC ATAGTGTTGG TTCAGAAGTA CCCATCGCTG 
CTGCAGCAAT GGGAGCTGAA TTGATTGAAA AGCACTTTAC TCTGGACAAT 
GAAATGGAAG GACCAGATCA TAAAGCGAGT GCTACTCCTG ATATCTTAGC AGCCTTGGTA 
AAAGGAGTGA GGATAGTGGA ACAATCTCTT GGTAAATTTG AAAAAGAGCC 
AGAAGAAGTT GAAGTACGAA ATAAAATTGT AGCTAGAAAA TCTATTGTTG CCAAAAAAGC 
AATTGCTAAA GGCGAAGTCT TTACAGAAGA AAACATCACT GTCAAAAGAC 
CAGGAAATGG AATTTCGCCA ATGGAATGGT ACAAAGTCTT GGGGCAGGTG AGTGAGCAGG 
ATTTTGAGGA AGACCAAAAT ATTTGCCATA GTGCTTTTGA AAATCAAATG 
TAAGCGGAGT AAGGATGAAA AAAATTTGTT TTGTGACAGG CTCTCGTGCC GAATATGGGA 
TTATGCGTCG CTTATTGAGC TATCTACAGG ATGATCCAGA AATGGAGCTG 
GATCTTGTAG TGACAGCCAT GCATCTAGAA GAAAAATATG GGATGACGGT CAAAGACATC 
GAAGCGGACA AGCGTAGGAT TGTCAAGCGG ATTCCATTGC ATTTGACGGA 
TACGTCTAAG CAGACAATCG TCAAATCTTT AGCGACCTTG ACAGAGCAAC TCACGGTTCT 
TTTTGAAGAA GTCCAGTATG ACTTGGTGTT GATTCTGGGG GATCGCTATG 
AGATGCTACC AGTTGCCAAT GCTGCGTTGC TTTATAATAT TCCTATTTGC CATATTCATG 
GTGGTGAAAA AACCATGGGA AATTTTGATG AGTCGATTCG CCATGCCATT 
ACCAAGATGA GTCACCTTCA TCTGACATCA ACGGATGAAT TTAGAAATCG TGTCATTCAA 
CTAGGAGAAA ATCCAACCAT GTACTGAACA TCGGAGCTAT GGGTGTTGAA 
AATGTTTTAA AACAAGACTT TTTGACAAGA GAAGAGTTGG CGATGGAACT TGGAATTGAT 
TTTGCCGAGG ATTACTATGT TGTACTCTTT CACCCTGTTA CCTTGGAGGA 
TAACACAGCC GAAGAACAAA CGCAGGCCTT ATTAGATGCT CTAAAAGAAG ATGGTAGCCA 
GTGTTTGATA ATTGGATCCA ATTCGGATAC ACATGCCGAT AAGATAATGG 
AATTGATGCA TGAATTTGTA AAACAAGACT CTGATTCTTA CATCTTTACT TCGCTTCCAA 
CTCGTTATTA CCATTCCTTG GTCAAGCATT CACAAGGTTT AATAGGGAAT 
TCTTCGTCAG GTTTGATTGA AGTGCCCTCA TTACAGGTTC CGACCTTAAA TATTGGAAAT 
CGCCAATTTG GACGTTTGTC AGGACCGAGT GTGGTACATG TTGGAACTTC 
TAAGGAAGCG ATTGTTGGTG GTTTGGGGCA ATTACGTGAT GTGATAGATT TTACCAATCC 
ATTTGAACAA CCTGATTCTG CTTTACAAGG TTATCGAGCT ATCAAGGAAT 
TTTTATCTGT ACAGGCCTCA ACCATGAAAG AGTTTTATGA TAGATAGGGG AGAAAGTTTG 
ATGAAAAAAG TAGCCTTTCT AGGAGCGGGT ACCTTTTCAG ATGGTGTCCT 
TCCTTGGTTG GATAGAACTC GATATGAACT CATTGGATAT TTTGAAGATA AACCGATCAG 
TGACTATCGT GGCTATCCTG TATTTGGTCC CTTGCAAGAT GTCCTAACCT 
ATTTGGATGA TGGAAAAGTA GATGCTGTCT TCGTCACTAT AGGTGACAAT GTCAAGCGCA 
AGGAAATCTT TGACTTGCTT GCCAAAGATC ATTATGATGC TTTGTTCAAC 
ATCATTAGCG AGCAAGCCAA TATTTTTTCC CCAGATAGTA TCAAGGGACG AGGGGTTTTC 
ATAGGTTTTT CAAGTTTTGT AGGAGCCGAT TCCTATGTCT ATGACAATTG 
TATCATCAAT ACGGGTGCCA TTGTGGAACA TCATACCACG GTGGAGGCCC ATTGTAACAT 
TACTCCAGGA GTGACCATAA ATGGCTTGTG CCGTATCGGA GAAAGCACTT 
ATATTGGAAG TGGTTCAACA GTGATTCAAT GTATCGAGAT TGCACCTTAT ACAACATTGG
GGGCAGGGAC AGTTGTTTTG AAATCGTTGA CGGAGTCAGG GACCTATGTT
GGTGTACCTG CTAGAAAGAT TAAATAGGTG AATTGATGGA ACCAATTTGT CTGATTCCTG
CTCGGTCAGG ATCAAAAGGT TTACCAAATA AAAACATGTT ATTTTTAGAT
GGTGTACCGA TGATTTTCCA TACCATTCGA GCTGCGATTG AGTCTGGATG TTTTAAGAAA
GAAAATATAT ATGTCAGTAC TGATTCAGAG GTTTACAAGG AAATTTGTGA
AACAACTGGG GTTCAAGTCC TCATGCGTCC AGCTGACTTG GCGACAGATT TTACAACCTC
TTTTCAACTG AACGAACATT TTTTACAAGA TTTTTCTGAT GACCAAGTAT
TTGTTCTCCT GCAAGTTACG TCCCCATTAA GATCGGGAAA ACATGTCAAG GAGGCGATGG
AGTTATATGG GAAAGGTCAA GCTGACCACG TTGTTAGCTT TACCAAAGTC
GATAAGTCTC CAACATTGTT TTCAACTTTA GACGAAAACG GATTCGCTAA GGATATTGCA
GGATTAGGTG GCAGTTATCG TCGTCAAGAT GAGAAAACAC TCTACTATCC
TAATGGAGCG ATTTATATTT CTTCTAAGCA GGCTTATTTA GCGGATAAAA CTTATTTTTC
TGAAAAAACA GCGGCCTATG TGATGACGAA GGAAGATTCG ATTGATGTAG
ATGATCACTT TGATTTTACT GGTGTTATTG GTCGAATTTA CTTTGATTAC CAGCGTCGTG
AGCAACAAAA CAAACCATTT TATAAAAGAG AGTTAAAGCG TTTATGTGAG
CAACGAGTCC ATGATAGTCT TGTGATTGGC GATAGTCGTC TGTTAGCCTT GTTACTGGAT
GGTTTCGATA ATATCAGCAT CGGTGGGATG ACAGCTTCGA CAGCACTTGA
AAACCAAGGT CTCTTTTTGG CTACTCCGAT AAAGAAAGTT TTGCTTTCTC TTGGTGTGAA
TGATTTGATT ACTGACTATC CCTTGCATAT GATTGAGGAT ACTATTCGCC
AGCTGATGGA AAGTCTTGTT TCCAAAGCAG AGCAGGTTTT TGTGACGACG ATTGCCTACA
CGCTGTTTCG TGATAGCGTT TCCAATGAAG AAATTGTGCA GCTGAATGAC
GTTATTGTTC AGTCAGCAAG TGAACTGGGT ATTTCAGTGA TTGATCTAAA TGAAGTTGTT
GAAAAAGAGG CGATGCTTGA CTATCAGTAT ACCAATGATG GATTGCATTT
CAATCAGATT GGACAAGAGC GTGTGAATCA GCTGATTTTG ACAAGTTTGA CAAGATAATT
TGGTGATAGA AGCTATTTCA GTGGCTAGAC TATGTTGGTA TGTGTTTTAG
AGCCCAGGAA TAACATCTGT AGAGGATGCT AGCCTTGAGA ATTGACAACC ATTTAGTTGT
TTTAATTATA TAAGGGGACC TCTAAAAACT CCCTAAATTT CCCAAAAATG
AGATAATAGA ATAAAAAGTA ATGAGGAGAG CTGTCATGCA TTTATTCACA GACGATGAAA
AAATCTTGTC AAAACTATCA GAGAAAGGCA ATCCCTTAGA ACGTTTGGAT
GCCGTTATGG ATTGGAATAT CTTTCTTCCA TTGTTGTCAG AGTTATTCAG TCGTAAAGAT
AAAGTCATCA GTCGTGGCGG TCGTCCTCAC CTAGACTATC TCATGATGTT
CAAAGCGCTC TTGCTTCAAC GTCTTCATAA CCTATCTGAC GATGCCATGG AATATCAACT
GCTGGATCGT ATATCTTTTC GTCGTTTTGT TGGTTGTCAT GAAGACACTG
TTCCCGATGC GAAAACTATC TGGCTCTATC GTGAGAAATT AACCAAGTCA GGTCGTGAAA
AGGAGTTGTT CGATTTGTTC TATGCCCATC TCACAGATGA AGGGGTGATT
GCCCATTCAG GTCAGATTGT GGATGCTACC TTTGTCGAAT GCCCTAAACA ACGCAATTCA
CGTGAGGACA ATCAGAAAAT CAAAACTTAT CGAAAATTAT GAGGTCACAA
CAGCTAGTGT ACACGACTCC AATGTCCTAG CTCCTCTTTG TGATGCCAAT GAAGCGGTTT
TTGATGACAG TGCTTATGTT GGAAAATCAG TACCAGAAGG TTGTCGCCAC
CACACGATTC GTCGTGCTTT TAGAAATAAA CCGTTGACTG AGACTGATAA GGTCATTAAT
CGACATATTA CCAAAGTCCG TTGTCGCGTT GAGCATGGTT TTGGCTTCAT
TGAAACTAAC ATGAAAGGTA ACATCTGTCG AGCAATTGGG AAGGCACGAG CTGAAACCAA
TGTGACCTTA ACCAACCTGC TCTACAATAT CTGTCGTTTT GAGCAAATCA
AACGACTGGG ATTACCATCC GTGGGCTTAG TGCGCCCAAA AAATAGGAAA ATAAGCAAAA
AGAGGCTGGG CAAAAACTAG TTTCTCACAA TAAAAAAACG GCTCTTTGTC
AACTGTAGTG GGTAGACGAA AAGCTAACAC CTAGAGAGGA CGAAATTCGT TCTCTCATTT
TTGATGTTTA AAGCGTAACC GCCTAATAAC AAGGTATCTA TCCAATCACA
CATTCCTCCA TTATATAGTT AAATGAAACA AAAACAGTAC ATCTATGATA TAATGTATTT
ATGGCATATT CATTAGATTT TCGTAAAAAA GTTCTCGCAT ACTGTGAGAA
AACCGGCAGT ATTACTGAAG CATCAGCTAT TTTCCAAGTT TCACGTAACA CTATCTATCA
ATGGCTAAAA TTAAAAGAGA AAACCGGCGA GCTTCATCAC CAAGTTAAAG
GAACCAAGCC AAGAAAAGTG GATAGAGATA AATTAAAGAA TTATCTTGAA ACTCATCCAG
ATGCTTATTT GACTGAAATA GCTTCTGAAT TTGACTGTCA TCCAACAGCT
ATTCATTACC CCCTCAAAGC TATGGGATAT ACTCGAAAAA AAAGAGCTGT ACCTACTATG
AACAAGACCC TGAAAAAGTA GAACTGTTCC TTAAAGAATT GAATAACTTA
AGCCACTTGA CTCCTGTTTA TATTGACGAG ACAGGGTTTG AGACATATTT TCATCGAAAA
TATGGTCGCT CTTTGAAAGG TCAGTTGATA AAAGGTAAGG TCTCTGGAAG
AAGATACCAG CGGATATCTT TAGTAGCAGG TCTCATAAAT GGTGCGCTTA TAGCCCCGAT
GACATACAAA GATACTATGA CGAGTGGCTT TTTCGAAGCT T

SLDIDHMMEVMEASKSAAGSACPSPQAYQAAFEGAENIIWTITGGLSGSFNAARVARDM
YIEERPNVNIHLIDSJjSASGEMDLLVHQINRLISAGLDFPQVVEAITHYREHSKLLFVLA
KVDNLVKNGRLSKLVGTWGLLNIRMVGEASAEGKLELLQKARGHKKSVTAAFEEMKKAG 
YDGGRIVMAHRNNAKFFQQFSELVKASFPTAVIDEVATSGLCSFYAEEGGLLMGYEVKA

MKKYQVIIQDILTGIEEHRFKRGEKLPSIRQLREQYHCSKDTVQKAMLELKYQNKIYAVE
KSGYYILEDRDFQDHTCRAQSYRLSRITYEDFRICLKESLIGRENYLFNYYHQQEGLAEL 
ISSVQSLLMDYHVYTKKDQLVITAGSQQALYILTQMETLAGKTEILIENPTYSRMIELIR 
RQGIPYQTIERNLDGIDLEELESIFQTGKIKFFYTIPRLHNPLGSTYDIATKTAIVKLAK
QYDVYIIEDDYLADFDSSHSLPLHYLDTDNRVIYIKSFTPTLFPALRIGAISLPNQLRDI
FIKHKSLIDYDTNLIMQKALSLYIDNGMFARNTQHLHHIYHAQWNKIKDCLEKYALNIPY 
RIPKGSVTFQLSKGILSPSIQHMFGKCYYFSGQKADFLQIFFEQDFADKLEQFVRYLNE

MKIIIPNAKEVNTNLENASFYLLSDRSKPVLDAISQFDVKKMAAFYKLNEAKAELEADRW
YRIRTGQAKTYPAWQLYDGLMYRYMDRRGIDSKEENYLRDHVRVATALYGLIHPFEFISP
HRLDFQGSLKIGNQSLKQYWRPYYDQEVGDDELILSLASSEFEQVFSPQIQKRLVKILFM 
EEKAGQLKVHSTISKKGRGRLLSWLAKNNIQELSDIQDFKVDGFEYCTSESTANQLTFXR 
SIKM

MKKRSGRSKSSKFKLVNFALLGLYSITLCLFLVTMYRYNILDFRYLNYIVTLLLVGVAVL
AGLIMiRKKARIFTALLLVFSLVTTSVGIYGMQEWKFSTRLNSNSTFSEYEMSILVPAN
SDITDVRQLTSILAPAEYDQDNITALLDDISKMESTQLATSPGTSYLTAYQSMLNGESQA 
MVFNGVFTNIXiENEDPGFSSKVKKIYSFKVTQTVETATKQVSGDSFNIYISGIDAYGPIS 
TVSRSDVNIIMTVNRATHKILLTTTPRDSYVAFADGGQNQYDKLTHAGIYGVNASVHTLE 
NFYGIDISNYVRLNFISFLQLIDLVGGIDVYNDQEFTSLHGNYHFPVGQVHLNSDQALGF 
VRERYSLTGGDNDRGKNQEKVIAALIKKMSTPENLKNYQAILSGLEGSIQTDLSLETIMS 
LVNTQLESGTQFTVESQALTGTGRSDLSSYAMPGSQLYMMEINQDSLEQSKAAIQSVLVE 
K

MNNQEVNAIEIDVLFLLKTIWRKKFLILLTAVLTAGLAFVYSSFLVTPQYDSTTRIYWS
QWEAGAGLmQELQAGTYLAKDYREIILSQDVLTQVATELNLKESLKEKISVSIPVDTR
IVSISVRDADPNEAARIANSLRTFAVQKWEVTKVSDVTTLEEAVPAEEPTTPNTKRNIL 
LGLLAGGILATGLVLVMEVLDDRVKRPQDIEEVMGLTLLGIVPDSKKLK

MAMLEIARTKREGVNKTEEYFNAIRTNIQLSGADIKWGITSVKSNEGKSTTAASLAIAY
ARSGYKTVLVDADIRNSVMPGFFKPITKITGLTDYLAGTTDLSQGLCDTDIPNIiTVIESG 
KVSPNPTALLQSKNFENLLATLRRYYDYVIVDCPPLGLVIDAAIIAQKCDAMVAWEAGN 
VKCSSLKKVKEQLEQTGTPFLGVILNKYDIATEKYSEYGNYGKKA

MIDIHSHIIFGVDDGPKTIEESLSLISEAYRQGVRYIVATSHRRKGMFETPEKIIMINFL
QLKEAVAEVYPEIRLCYGAELYYSKDILSKLEKKKVPTLNGSCYILLEFSTDTPWKEIQE 
AVNEMTLLGLTPVLAHIERYDALAFQSERVEKLIDKGCYTQVNSNHVLKPALIGERAKEF 
KKRTRYFLEQDLVHCVASDMHNLYSRPPFMREAYQLVKKEYGEDRAKALFKKNPLLILKN 
QVQ

MNIEIGYRQTKLALFDMIAVTISAILTSHIPNADLNRSGIFIIMMVHYFAFFISRMPVEF
EYRGNLIEFEKTFNYSIIFVIFLMAVSFMLENNFALSRRGAVYFTLINFVLVYLFNVIIK
QFKDSFLFSTTYQKKTILITTAELWENMQVLFESDILFQKNLVALVILGTEIDKINLPLP 
LYYSVEEAIGFSTREVVDYVFINLPSEYFDLKQLVSDFELLGIDVGVDINSFGFTVLKNK
KIQMLGDHSIVTFSTNFYKPSHIWMKRLLDILGAVVGLIISGIVSILLIPIIRRDGGPAI
FAQKRVGQNGRIFTFYKFRSMFVDAEVRKKELMAQNQMQGGMFKMDNDPRITPIGHFIRK
TSLDELPQFYNVLIGDMSLVGTRPPTVDEFEKYTPSQKRRLSFKPGITGLWQVSGRSDIT 
DFNEVVRLDLTYIDNWTIWSDIKILLKTVKVVLLREGGQ

MRTVYIIGSKGIPAKYGGFETFVEKLTEYQKDKSINYFVACTRENSAKSDITGEVFEHNG
ATCFNIDVPNIGSAKAILYDIMALKKSIEIAKDRNDTSPIFYILACRIGPFIYLFKKQIE 
SIGGQLFVNPDGHEWLREKWSYPVRQYWKFSESLMLKYADLLICDSKNIEKYIHEDYRKY 
APETSYIAYGTDLDKSRLSPTDSVVREWYKEKEISENDYYLVVGRFVPENNYEVMIREFM
KSYSRKDFVLITNVEHNSFYEKLKKETGFDKDKRIKFVGTVYNQELLKYIRENAFAYFHG 
HEVGGTNPSLLEALSSTKLNLLLDVGFNREVGEEGAKYWNKDNLHRVIDSCEQLSQEQIN 
DMDSLSTKQVKERFSWDFIVDEYEKLFKG

MKKILYLHAGAELYGADKVLLELIKGLDKNEFEAHVILPNDGVLVPALREVGAQVEVINY
PILRRKYFNPKGIFDYFISYHHYSKQIAQYAIENKVDIIHNNTTAVLEGIYLKRKLKLPL
LWHVHEIIVKPKFISDSINFLMGRFADKIVTVSQAVANHIKQSPHIKDDQISVIYNGVDN 
KVFYQSDARSVRERFDIDEEALVIGMVGRVNAWKGQGDFLEAVAPILEQNPKAIAFIAGS 
AFEGEEWRVVELEKKISQLKVSSQVRRMDYYANTTELYNMFDIFVLPSTNPDPLPTVVLK
AMACGKPVVGYRHGGVCEMVKEGVNGFLVTPNSPLNLSKVILQLSENINLRKKIGNNSIE
RQKEHFSLKSYVKNFSKVYTSLKVY

MKIISFTMVNNESEIIESFIRYNYNFIDEMVIIDNGCTDNTMQIIFNLIKEGYKISVYDE
SLEAYNQYRLDNKYLTKIIAEKNPDLIIPLDADEFLTADSNPRKLLEQLDLEKIHYVNWQ
WFVMTKKDDINDSFIPRRMQYCFEKPVWHHSDGKPVTKCIISAKYYKKMNLKLSMGHHTV
FGNPNVRIEHHNDLKFAHYRAISQEQLIYKTICYTIRDIATMENNIETAQRTNQMALIES 
GVDMWETAREASYSGYDCNVIHAPIDLSFCKENIVIKYNELSRETVAERVMKTGREMAVR
AYNVERKQKEKKFLKPIIFVLDGLKGDEYIHPNPSNHLTILTEMYNVRGLLTDNHQIKFL 
KVNYRLIITPDFAKFLPHEFIVVPDTLDIEQVKSQYVGTGVDLSKIISLKEYRKEIGFIG
NLYALLGFVPNMLNRIYLYIQRNGIANTIIKIKSRL

MQADRRKTFGKMRIRINNLFFVAIAFMGIIISNSQVVLAIGKASVTQYLSYLVLILCIVN
DLLKNNKHIVVYKLGYLFLIIFLFTIGICQQILPITTKIYLSISMMIISVLATLPISLIK
DIDDFRRISNHLLFALFITSILGIKMGATMFTGAVEGIGFSQGFNGGLTHKNFFGITILM 
GFVLTYLAYKYGSYKRTDRFILGLELFLILISNTRSVYLILLLFLFLVNLDKIKIEQRQW 
STLKYISMLFCAIFLYYFFGFLITHSDSYAHRVNGLINFFEYYRNDWFHLMFGAADLAYG 
DLTLDYAIRVRRVLGWNGTLEMPLLSIMLKNGFIGLVGYGIVLYKLYRNVRILKTDNIKT 
IGKSVFIIVVLSATVENYIVNLSFVFMPICFCLLNSISTMESTINKQLQT

MEKVSIIVPIFNTEKYLRECLDSIISQSYTNLEILLIDDGSSDSSTDICLEYAEQDGRIK
LFRLPNGGVSNARNYGIKNSTANYIMFVDSDDIVDGNIVESLYTCLKENDSDLSGGLLAT
FDGNYQESELQKCQIDLEEIKEVRDLGNENFPNHYMSGIFNSPCCKLYKNIYINQGFDTE 
QWLGEDLLFNLNYLKNIKKVRYVNRNLYFARRSLQSTTNTFKYDVFIQLENLEEKTFDLF 
VKIFGGQYEFSVFKETLQWHIIYYSLLMFKNGDESLPKKLHIFKYLYNRHSLDTLSIKRT 
SSVFKRICKLIVANNLFKIFLNTLIREEKNND

MINISIIVPI YNVEQYLSKC INSIVNQTYK HIEILLVNDG STDNSEEICL AYAKKDSRIR
YFKKENGGLS DARNYGISRA KGDYLAFIDS DDFIHSEFIQ RLHEAIEREN 
ALVAVAGYDR VDASGHFLTA EPLPTNQAVL SGRNVCKKLL EADGHRFVVA WNKLYKKELF
EDFRFEKGKI HEDEYFTYRL LYELEKVAIV KECLYYYVDR ENSIITSSMT 
DHRFHCLLEF QNERMDFYES RGDKELLLEC YRSFLAFAVL FLGKYNHWLS KQQKKLLQTL 
FRIVYKQLKQ NKRLALLMNA YYLVGCLHLN FSVFLKTGKD KIQERLRRSE 
SSTR

MSKKSIVVSG LVYTIGTILV QGLAFITLPI YTRVISQEVY GQFSLYNSWV GLVGLFIGLQ
LGGAFGPGWV HFREKFDDFV STLMVSSIAF FLPIFGLSFL LSQPLSLLFG 
LPDWVVPLIF LQSLMIVVQG FFTTYLVQRQ QSMWTLPLSV LSAVINTALS LFLTFPMEND
FIARVMANPA TTGVLACVSX WFSQKKNGLH FRKDYLRYGL SISIPLIFRG 
LGHNVLNQFD RIMLGKMLTL SDVALYSFGY TLASILQIVF SSLNTVWCPW YFEKKRGADK 
DLLSYVRYYL AIGLFVTFGF LTIYPELAML LGGSEYRFSM GFIPMIIVGV 
FFVFLYSFPA NIQFYSGNTK FLPIGTFIAG VLNISVHFVL IPTKNLWCCF ATTASYLLLL 
VLHYFVAKKK YAYDEVAIST FVKVIALVVV YTGLMTVFVG SIWIRWSLGI
AVLVVYAYIF RKELTVALNT FREKRSK

MVYIIAEIGC NHNGDVHLAR KMVEVAVDCG VDAVKFQTFK ADLLISKYAP KAEYQKITTG
ESDSQLEMTR RLELSFEEYL DLRDYCLEKG VDVFSTPFDE ESLDFLISTD 
MPVYKIPSGE ITNLPYLEKI GRQAKKVILS TGMAVMDEIH QAVKILQENG TTDISILHCT 
TEYPTPYPAL NLNVLHTLKK EFPNLTIGYS DHSVGSEVPI AAAAMGAELI 
EKHFTLDNEM EGPDHKASAT PDILAALVKG VRIVEQSLGK FEKEPEEVEV RNKIVARKSI 
VAKKAIAKGE VFTEENITVK RPGNGISPME WYKVLGQVSE QDFEEDQNIC 
HSAFENQM

MKKICFVTGS RAEYGIMRRL LSYLQDDPEM ELDLVVTAMH LEEKYGMTVK DIEADKRRIV
KRIPLHLTDT SKQTIVKSLA TLTEQLTVLF EEVQYDLVLI LGDRYEMLPV 
ANAALLYNIP ICHIHGGEKT MGNFDESIRH AITKMSHLHL TSTDEFRNRV IQLGENPTMY

MELGIDFAED YYVVLFHPVT LEDNTAEEQT QALLDALKED GSQCLIIGSN SDTHADKIME
LMREFVKQDS DSYIFTSLPT RYYHSLVKHS QGLIGNSSSG LIEVPSLQVP 
TLNIGNRQFG RLSGPSVVHV GTSKEAIVGG LGQLRDVIDF TNPFEQPDSA LQGYRAIKEF
LSVQASTMKE FYDR

MKKVAFLGAG TFSDGVLPWL DRTRYELIGY FEDKPISDYR GYPVFGPLQD VLTYLDDGKV
DAVFVTIGDN VKRKEIFDLL AKDHYDALFN IISEQANIFS PDSIKGRGVF 
IGFSSFVGAD SYVYDNCIIN TGAIVEHHTT VEAHCNITPG VTINGLCRIG ESTYIGSGST 
VIQCIEIAPY TTLGAGTVVL KSLTESGTYV GVPARKIK

MEPICLIPAR SGSKGLPNKN MLFLDGVPMI FHTIRAAIES GCFKKENIYV STDSEVYKEI
CETTGVQVLM RPADLATDFT TSFQLNEHFL QDFSDDQVFV LLQVTSPLRS 
GKHVKEAMEL YGKGQADHVV SFTKVDKSPT LFSTLDENGF AKDIAGLGGS YRRQDEKTLY
YPNGAIYISS KQAYLADKTY FSEKTAAYVM TKEDSIDVDD HFDFTGVIGR 
IYFDYQRREQ QNKPFYKREL KRLCEQRVHD SLVIGDSRLL ALLLDGFDNI SIGGMTASTA 
LENQGLFLAT PIKKVLLSLG VNDLITDYPL HMIEDTIRQL MESLVSKAEQ 
VFVTTIAYTL FRDSVSNEEI VQLNDVIVQS ASELGISVID LNEVVEKEAM LDYQYTNDGL
HFNQIGQERV NQLILTSLTR

ATCGCCAAAC GAAATTGGCA TTATTTGATA TGATAGCAGT TGCAATTTCT GCAATCTTAA CAAGTCATAT
ACCAAATGCT GATTTAAATC GTTCTGGAAT TTTTATCATA 

ATGATGGTTC ATTATTTTGC ATTTTTTATA TCTCGTATGC CAGTTGAATT TGAGTATAGA GGTAATCTGA 
TAGAGTTTGA AAAAACATTT AACTATAGTA TAATATTTGC 

AATTTTTCTT ACGGCAGTAT CATTTTTGTT GGAGAATAAT TTCGCACTTT CAAGACGTGG TGCCGTGTAT 
TTCACATTAA TAAACTTCGT TTTGGTATAC CTATTTAACG 

TAATTATTAA GCAGTTTAAG GATAGCTTTC TATTTTCGAC AATCTATCAA AAAAAGACGA TTCTAATTAC
AACGGCTGAA CGATGGGAAA ATATGCAAGT TTTATTTGAA 

TCACATAAAC AAATTCAAAA AAATCTTGTT GCATTGGTAG TTTTAGGTAC AGAAATAGAT AAAATTAATT 
TATCATTACC GCTCTATTAT TCTGTGGAAG AAGCTATAGA 

GTTTTCAACA AGGGAAGTGG TCGACCACGT CTTTATAAAT CTACCAAGTG AGTTTTTAGA CGTAAAGCAA 
TTCGTTTCAG ATTTTGAGTT GTTAGGTATT GATGTAAGCG 

TTGATATTAA TTCATTCGGT TTTACTGCGT TGAAAAACAA AAAAATCCAA CTGCTAGGTG ACCATAGCAT 
TGTAACTTTT TCCACAAATT TTTATAAGCC TAGTCATATC 

ATGATGAAAC GACTTTTGGA TATACTCGGA GCGGTAGTCG GGTTAATTAT TTGTGGTATA GTTTCTATTT 
TGTTAGTTCC AATTATTCGT AGAGATGGTG GACCGGCTAT 

TTTTGCTCAG AAACGAGTTG GACAGAATGG ACGCATATTT ACATTCTACA AGTTTCGATC GATGTATGTT 
GATGCTGAGG AGCGCAAAAA AGACTTGCTC AGCCAAAACC 

AGATGCAAGG GTGGGTATGT TTTAAAATGG GAAAAACGAT CCTAGAATTA CTCCAATTGG ACATTTCATA 
CGCAAAAACA AGTTTAGACG AGTTACCACA GTTTTATAAT 
GTTTTAATTG GCGATATGAG TCTAGTTGGT ACACGTCCAC CTACAGTTGA TGAATTTGAA AAATATACTC
CTGGTCAAAA GAGACGATTG AGTTTTAAAC CAGGGATTAC 

AGGTCTCTGG CAGGTTAGTG GTCGTAGTAA TATCACAGAC TTCGACGACG TAGTTCGGTT GGACTTAGCA 
TACATTGATA ATTGGACTAT CTGGTCAGAT ATTAAAATTT 

TATTAAAGAC AGTGAAAGTT GTATTGTTGA GAGAGGGAAG TAAGTAAAAG TATATGAAAG TTTGTTTGGT 
CGGTTCTTCA GGGGGACATT TGACTCACTT GTATTTGTTA 

AAACCGTTTT GGAAGGAAGA AGAACGTTTT TGGGTAACAT TTGATAAAGA GGATGCAAGA AGTCTTTTGA 
AGAATGAAAA AATGTATCCA TGTTACTTTC CAACAAATCG 

CAATCTCATT AATTTAGTGA AAAATACTTT CTTAGCTTTC AAAATTTTAC GTGATGAGAA ACCAGATGTT 
ATTATTTCAT CTGGTGCGGC CGTTGCTGTC CCCTTCTTTT 

ACATCGGAAA ACTATTTGGA GCAAAGACGA TTTATATTGA AGTATTTGAT CGAGTTAATA AATCTACATT 
AACTGGAAAA CTAGTTTATC CCGTAACAGA TATTTTTATT 

GTTCAGTGGG AAGAAATGAA GAAGGTATAT CCTAAATCTA TTAACTTGGG GAGTATTTTT TAATGATTTT 
TGTAACAGTA GGAACTCATG AACAACAGTT TAATCGATTG 

ATAAAAGAGA TTGATTTATT GAAAAAAAAT GGAAGTATAA CCGACGAAAT ATTTATTCAA ACAGGATATT 
CTGACTATAT TCCAGAATAT TGCAAGTATA AAAAATTTCT 

CAGTTACAAA GAAATGGAAC AATATATTAA CAAATCAGAA GTAGTTATTT GCCACGGAGG CCCCGCTACT 
TTTATGAATT CATTATCCAA AGGAAAAAAA CAATTATTGT 

TTCCTAGACA AAAAAAGTAT GGTGAACATG TAAATGATCA TCAAGTAGAG TTTGTAAGAA GAATTTTACA 
AGATAATAAT ATTTTATTTA TAGAAAATAT AGATGATTTG 

TTTGAAAAAA TTATTGAAGT TTCTAAGCAA ACTAACTTTA CATCAAATAA TAATTTTTTT TGTGAAAGAT 
TAAAACAAAT AGTTGAAAAA TTTAATGAGG ATCAAGAAAA 

TGAATAATAA AAAAGATGCA TATTTGATAA TGGCTTATCA TAATTTTTCT CAGATTTTAC TGGAGAGGGA 
TACAGATATT ATCATCTTCT CTCAGGAGAA TGCACACCAT 

TAGTTCCTTC AGAATACCTG TATAATTATT TTAAATATTC TCAGGATTTA TATGTTGAAT TTACAAAAGA 
TGAGCAAAAA TATAAAGAAA ATAGGATATA TGAACGAGTT 

AAATGTTACA GATTATTTCC TAATATATCA GAAAAAACTA TTGATAATGT ACTGTTTAGA ATTTTATTAA 
GAATGTATCG AGCTTTTGAA TACTATTTAC AAAGATTGTT 

GTTTATTGAT AGAATAAAAA ACATGGTCTA AGAATAAGAT TTGGTTCTAA TTGGGTTTCG CTTCCACATG 
ATTTTGTGGC AATTCTTTTA TCAAATGAAA ACGAAACAGC 

TTATTTATTT AAGTAATCTA AATGTCCAGA TGAACTATTT ATACAGACAA TTATAGAAAA ATATGAATTT 
TCAAATAGAT TATCTAAATA TGGAAATTTA AGATATATAA 

AGTGGAAAAA ATCAACATCT TCTCCTATTG TCTTTACAGA TGATTCTATT GATGAATTGC TAAATGCAAG 
AAATTTAGGT TTTTTATTTG CTAGAAAGTT AAAAATAGAA 

AATAAATCTA AATTTAAAGA AATTATTACT AAAAAATAAA ATAGTTGATT TTGTGAGAGT AATGTATGTT 
TAAATTATTT AAATATGACC CGGAATATTT TATTTTTAAG 

TACTTCTGGT TGATTATTTT TATTCCAGAG CAAAAGTATG TATTTTTATT AATTTTTATG AATTTAATTT 
TATTTCATAT AAAATTTTTG AAAACTAAGC TAATATTAAA 

AAATGAAATT TTATTGTTTT TATTATGGTC TATATTATGT TTTGTTTCAG TAGTCACAAG TATGTTTGTT 
GAAATAAATT TTGAAAGATT ATTTGCAGAT TTTACTGCTC 

CCATAATTTG GATTATTGCA ATAATGTATT ATAATTTGTA TTCATTTATA AATATTGATT ATAAAAAATT 
AAAAAATAGT ATCTTTTTTA GTTTTTTAGT TTTATTAGGT 

ATATCTGCAT TGTATATTAT TCAAAATGGG AAAGATATTG TATTTTTAGA CAGACACCTT ATAGGACTAG 
ACTATCTTAT AACAGGCGTC AAAACAAGGT TGGTTGGCTT 

TATGAACTAT CCTACGTTAA ATACCACTAC AATTATAGTT TCAATTCCGT TAATCTTTGC ACTTATAAAA 
AATAAAATGC AACAATTTTT TTTCTTGTGT CTTGCTTTTA 

TACCGATCTA TTTAAGTGGA TCGAGAATTG GTAGTTTATC GCTAGCAATA TTAATTATAT GCTTGTTATG
GAGATATATA GGTGGAAAAT TTGCTTGGAT AAAAAAGCTA 

ATAGTAATAT TTGTAATACT ACTTATTATT TTAAATACTG AATTGCTTTA CCATGAAATT TTGGCTGTTT 
ATAATTCTAG AGAATCAAGT AACGAAGCTA GATTTATTAT 

TTATCAAGGA AGTATTGATA AAGTATTAGA AAACAATATT TTATTTGGAT ATGGAATATC CGAATATTCA 
GTTACGGGAA CTTGGCTCGG AAGTCATTCA GGCTATATAT 

CATTTTTTTA TAAATCAGGA ATAGTTGGGT TGATTTTACT GATGTTTTCT TTTTTTTATG TTATAAAAAA 
AAGTTATGGA GTTAATGGGG AAACAGCACT ATTTTATTTT 

ACATCATTAG CCATATTTTT CATATATGAA ACAATAGATC CGATTATTAT TATATTAGTA CTATTCTTTT 
CTTCAATAGG TATTTGGAAT AATATAAATT TTAAAAAGGA 

TATGGAGACA AAAAATGAAT GATTTAATTT CAGTTATTGT ACCAATTTAT AATGTCCAAG ATTATCTTGA 
TAAATGTATT AACAGTATTA TTAACCAAAC ATATACTAAT 

TTAGAGGTTA TTCTCGTAAA TGATGGAAGT ACTGATGATT CTGAGAAAAT TTGCTTAAAC TATATGAAGA 
ACGATGGAAG AATTAAATAT TACAAGAAAA TTAATGGCGG 

TCTAGCAGAT GCTCGAAATT TCGGACTAGA ACATGCAACA GGTAAATATA TTGCTTTTGT CGATTCTGAT 
GACTATATAG AAGTTGCAAT GTTCGAGAGA ATGCATGATA 

ATATAACTGA GTATAATGCC GATATAGCAG AGATAGATTT TTGTTTAGTA GACGAAAACG GGTATACAAA 
GAAAAAAAGA AATAGTAATT TTCATGTCTT AACGAGAGAA 

GAGACTGTAA AAGAATTTTT GTCAGGATCT AATATAGAAA ATAATGTTTG GTGCAAGCTT TATTCACGAG 
ATATTATAAA AGATATAAAA TTCCAAATTA ATAATAGAAG

TATTGGTGAG GATTTGCTTT TTAATTTGGA GGTCTTGAAC AATGTAACAC GTGTAGTAGT TGATACTAGA 
GAATATTATT ATAATTATGT CATTCGTAAC AGTTCGCTTA 

TTAATCAGAA ATTCTCTATA AATAATATTG ATTTAGTCAC AAGATTGGAG AATTACCCCT TTAAGTTAAA 
AAGAGAGTTT AGTCATTATT TTGATGCAAA AGTTATTAAA 

GAGAAGGTTA AATGTTTAAA CAAAATGTAT TCAACAGATT GTTTGGATAA TGAGTTCTTG CCAATATTAG 
AGTCTTATCG AAAAGAAATA CGTAGATATC CATTTATTAA 

AGCGAAAAGA TATTTATCAA GAAAGCATTT AGTTACGTTG TATTTGATGA AATTTTCGCC TAAACTATAT 
GTAATGTTAT ATAAGAAATT TCAAAAGCAG TAGAGGTAAA 

AATGGATAAA ATTAGTGTTA TTGTTCCAGT TTATAATGTA GATAAATATT TAAGTAGTTG TATAGAAAGC 
ATTATTAATC AAAATTATAA AAATATAGAA ATATTATTGA

TAGATGATGG CTCTGTAGAT GATTCTGCTA AAATATGCAA GGAATATGCA GAAAAAGATA AAAGAGTAAA 
AATTTTTTTC ACTAATCATA GTGGAGTATC AAATGCTAGA 

AATCATGGAA TAAAGCGGAG TACAGCTGAA TATATTATGT TTGTTGACTC TGATGATGTT GTTGATAGTA 
GATTAGTAGA AAAATTATAT TTTAATATTA TAAAAAGTAG 

AAGTGATTTA TCTGGTTGTT TGTACGCTAC TTTTTCAGAA AATATAAATA ATTTTGAAGT GAATAATCCA 
AATATTGATT TTGAAGCAAT TAATACCGTG CAGGACATGG 

GAGAAAAAAA TTTTATGAAT TTGTATATAA ATAATATTTT TTCTACTCCT GTTTGTAAAC TATATAAGAA 
AAGATACATA ACAGATCTTT TTCAAGAGAA TCAATGGTTA 

GGAGAAGATT TACTTTTTAA TCTGCATTAT TTAAAGAATA TAGATAGAGT TAGTTATTTG ACTGAACATC 
TTTATTTTTA TAGGAGAGGT ATACTAAGTA CAGTAAATTC 

TTTTAAAGAA GGTGTGTTTT TGCAATTGGA AAATTTGCAA AAACAAGTGA TAGTATTGTT TAAGCAAATA 
TATGGTGAGG ATTTTGACGT ATCAATTGTT AAAGATACTA 

TACGTTGGCA AGTATTTTAT TATAGCTTAC TAATGTTTAA ATACGGAAAA CAGTCTATTT TTGACAAATT 
TTTAATTTTT AGAAATCTTT ATAAAAAATA TTATTTTAAC 

TTGTTAAAAG TATCTAACAA AAATTCTTTG TCTAAAAATT TTTGTATAAG AATTGTTTCG AACAAAGTTT 
TTAAAAAAAT ATTATGGTTA TAATAGGAAG ATATCATGGA 

TACTATTAGT AAAATTTCTA TAATTGTACC TATATATAAT GTAGAAAAAT ATTTATCTAA ATGTATAGAT 
AGCATTGTAA ATCAGACCTA CAAACATATA GAGATTCTTC 

TGGTGAATGA CGGTAGTACG GATAATTCGG AAGAAATTTG TTTAGCATAT GCGAAGAAAG ATAGTCGCAT 
TCGTTATTTT AAAAAAGAGA ACGGCGGGCT ATCAGATGCC 

CGTAATTATG GCATAAGTCG CGCCAAGGGT GACTACTTAG CTTTTATAGA CTCAGATGAT TTTATTCATT 
CGGAGTTCAT CCAACGTTTA CACGAAGCAA TTGAGAGAGA 

GAATGCCCTT GTGGCAGTTG CTGGTTATGA TAGGGTAGAT GCTTCGGGGC ATTTCTTAAC AGCAGAGCCG 
CTTCCTACAA ATCAGGCTGT TCTGAGCGGC AGGAATGTTT 

GTAAAAAGCT GCTAGAGGCG GATGGTCATC GCTTTGTGGT GGCCTGTAAT AAACTCTATA AAAAAGAACT 
ATTTGAAGAT TTTCGATTTG AAAAGGGTAA GATTCATGAA 

GATGAATACT TCACTTATCG CTTGCTCTAT GAGTTAGAAA AAGTTGCAAT AGTTAAGGAG TGCTTGTACT
ATTATGTTGA CCGAGAAAAT AGTATCACAA CTTCTAGCAT 

GACTGACCAT CGCTTCCATT GCCTACTGGA ATTTCAAAAT GAACGAATGG ACTTCTATGA AAGTAGAGGA 
GATAAAGAGC TCTTACTAGA GTGTTATCGT TCATTTTTAG 

CCTTTGCTGT TTTGTTTTTA GGCAAATATA ATCATTGGTT GAGCAAACAG CAAAAGAAGC TT 

RQTKLALFDM IAVAISAILT SHIPNADLNR SGIFIIMMVH YFAFFISRMP VEFEYRGNLI
EFEKTFNYSI IFAIFLTAVS FLLENNFALS RRGAVYFTLI NFVLVYLFNV
IIKQFKDSFL FSTIYQKKTI LITTAERWEN MQVLFESHKQ IQKNLVALW LGTEIDKINL
SLPLYYSVEE AIEFSTREW DHVFINLPSE FLDVKQFVSD FELLGIDVSV
DINSFGFTAL KNKKIQLLGD HSIVTFSTNF YKPSHIMMKR LLDILGAVVG LIICGIVSIL
LVPIIRRDGG PAIFAQKRVG QNGRIFTFYK FRSMYVDAEE RKKDLLSQNQ
MQGWVCFKMG KTILELLQLD ISYAKTSLDE LPQFYNVLIG DMSLVGTRPP TVDEFEKYTP
GQKRRLSFKP GITGLWQVSG RSNITDFDDV VRLDLAYIDN WTIWSDIKIL
LKTVKVVLLR EGSK

MKVCLVGSSG GHLTHLYLLK PFWKEEERFW VTFDKEDARS LLKNEKMYPC YFPTNRNLIN
LVKNTFLAFK ILRDEKPDVI ISSGAAVAVP FFYIGKLFGA KTIYIEVFDR
VNKSTLTGKL VYPVTDIFIV QWEEMKKVYP KSINLGSIF

MIFVTVGTHE QQFNRLIKEI DLLKKNGSIT DEIFIQTGYS DYIPEYCKYK KFLSYKEMEQ
YINKSEVVIC HGGPATFMNS LSKGKKQLLF PRQKKYGEHV NDHQVEFVRR
ILQDNNILFI ENIDDLFEKI IEVSKQTNFT SNNNFFCERL KQIVEKFNED QENE

MFKLFKYDPE YFIFKYFWLI IFIPEQKYVF LLIFMNLILF HIKFLKTKLI LKNEILLFLL
WSILCFVSVV TSMFVEINFE RLFADFTAPI IWIIAIMYYN LYSFINIDYK
KLKNSIFFSF LVLLGISALY IIQNGKDIVF LDRHLIGLDY LITGVKTRLV GFMNYPTLNT
TTIIVSIPLI FALIKNKMQQ FFFLCLAFIP IYLSGSRIGS LSPLAILIIC
LLWRYIGGKF AWIKKLIVIF VILLIILNTE LLYHEILAVY NSRESSNEAR FIIYQGSIDK
VLENNILFGY GISEYSVTGT WLGSHSGYIS FFYKSGIVGL ILLMFSFFYV
IKKSYGVNGE TALFYFTSLA IFFIYETIDP IIIILVLFFS SIGIWNNINF KKDMETKNE

MNDLISVIVP IYNVQDYLDK CINSIINQTY TNLEVILVND GSTDDSEKIC LNYMKNDGRI
KYYKKINGGL ADARNFGLEH ATGKYIAFVD SDDYIEVAMF ERMHDNITEY
NADIAEIDFC LVDENGYTKK KRNSNFHVLT REETVKEFLS GSNIENNVWC KLYSRDIIKD
IKFQINNRSI GEDLLFNLEV LNNVTRVVVD TREYYYNYVI RNSSLINQKF
SINNIDLVTR LENYPFKLKR EFSHYFDAKV IKEKVKCLNK MYSTDCLDNE FLPILESYRK
EIRRYPFIKA KRYLSRKHLV TLYLMKFSPK LYVMLYKKFQ KQ

MDKISVIVPV YNVDKYLSSC IESIINQNYK NIEILLIDDG SVDDSAKICK EYAEKDKRVKI
FFTNHSGVSN ARNHGIKRST AEYIMFVDSD DVVDSRLVEK LYFNIIKSRS
DLSGCLYATF SENINNFEVN NPNIDFEAIN TVQDMGEKNF MNLYINNIFS TPVCKLYKKR
YITDLFQENQ WLGEDLLFNL HYLKNIDRVS YLTEHLYFYR RGILSTVNSF
KEGVFLQLEN LQKQVIVLFK QIYGEDFDVS IVKDTIRWQV FYYSLLMFKY GKQSIFDKFL
IFRNLYKKYY FNLLKVSNKN SLSKNFCIRI VSNKVFKKIL WL

MDTISKISII VPIYNVEKYL SKCIDSIVNQ TYKHIEILLV NDGSTDNSEE ICLAYAKKDS
RIRYFKKENG GLSDARNYGI SRAKGDYLAF IDSDDFIHSE FIQRLHEAIE
RENALVAVAG YDRVDASGHF LTAEPLPTNQ AVLSGRNVCK KLLEADGHRF VVACNKLYKK
ELFEDFRFEK GKIHEDEYFT YRLLYELEKV AIVKECLYYY VDRENSITTS
SMTDHRFHCL LEFQNERMDF YESRGDKELL LECYRSFLAF AVLFLGKYNH WLSKQQKK

AAGCTTATCG TCAAGGTGTT CGCTATATCG TGGCGACATC TCATAGACGA AAAGGGATGT
TTGAAACACC AGAAAAAGTT ATCATGACTA ACTTTCTTCA ATTTAAAGAC
GCAGTAGCAG AAGTTTATCC TGAAATACGA TTGTGCTATG GTGCTGAATT GTATTATAGT
AAAGATATAT TAAGCAAACT TGAAAAAAAG AAAGTACCCA CACTTAATGG
CTCGCGCTAT ATTCTTTTGG AGTTCAGTAG TGATACTCCT TGGAAAGAGA TTCAAGAAGC
AGTGAACGAA GTGACGCTAC TTGGGCTAAC TCCCGTACTT GCCCATATAG
AACGATATGA CGCCCTAGCG TTTCATGCAG AGAGAGTAGA AGAGTTAATT GACAAGGGAT
GCTATACTCA GGTAAATAGT AATCATGTGC TGAAGCCCAC TTTAATTGGT
GATCGAGCAA AAGAATTTAA AAAACGTACT CGGTATTTTT TAGAGCAGGA TTTAGTACAT
TGTGTTGCTA GCGATATGCA TAATTTATCT AGTAGACCTC CGTTTATGAG
GGAGGCTTAT AAGTTGCTAA CAGAGGAATT TGGCAAAGAT AAAGCGAAAG CGTTGCTAAA
AAAGAATCCT CTTATGCTAT TAAAAAACCA GGCGATTTAA ACTGGTTACT
CTAGATTGTG GAGAGAAAAA TGGATTTAGG AACTGTTACT GATAAACTGT TAGAACGCAA
CAGTAAACGA TTGATACTCG TGTGCATGGA TACGTGTCTT CTTATAGTTT
CCATGATTTT GAGCAGACTG TTTTTGGATG TTATTATTGA CATACCAGAT GAACGCTTCA
TTCTTGCAGT TTTATTCGTA TCAATTTTAT ATTTGATTCT ATCGTTTAGA
TTAAAAGTCT TTTCATTAAT TACGCGTTAC ACAGGGTATC AGAGTTATGT AAAAATAGGA
CTTAGTTTAA TATCTGCGCA TTCATTGTTT TTAATTATCT CAATGGTGTT
GTGGCAGGCT TTTAGTTATC GTTTCATCTT AGTATCCTTA TTTTTGTCGT ATGTAATGCT
CATTACTCCG AGGATTGTTT GGAAAGTCTT ACATGAGACG AGAAAAAATG
CTATCCGTAA GAAGGATAGC CCACTAAGAA TCTTAGTAGT AGGTGCTGGA GATGGTGGTA
ATATTTTTAT CAATACTGTC AAAGATCGAA AATTGAATTT TGAAATTGTC
GGTATCGTTG ATCGTGATCC AAATAAACTT GGAACATTTA TCCGTACGGC TAAAGTTTTA
GGAAACCGTA ATGATATTCC ACGACTGGTA GAGGAATTAG CTGTTGACCA
AGTGACGATT GCCATCCCTT CTTTAAATGG TAAGGAGCGA GAGAAGATTG TTGAAATCTG
TAACACTACA GGAGTGACCG TCAATAATAT GCCGAGTATT GAAGACATTA
TGGCGGGGAA CATGTCTGTC AGTGCCTTTC AGGAAATTGA CGTAGCAGAC CTTCTTGGTC
GACCAGAGGT TGTTTTGGAT CAGGATGAAT TGAATCAGTT TTTCCAAGGG
AAAACAATCC TTGTCACAGG AGCAGGTGGC TCTATCGGTT CAGAGCTATG TCGTCAAATT
GCTAAGTTTA CGCCTAAACG CTTGTTGTTG CTTGGACATG GAGAAAATTC
AATCTATCTC ATTCATCGAG AGTTACTGGA AAAGTACCAA GGTAAGATTG AGTTGGTCCC
TCTCATTGCA GATATTCAAG ATAGAGAATT GATTTTTAGC ATAATGGCTG
AATATCAACC CGATGTTGTT TATCATGCTG CAGCACATAA GCATGTTCCT TTGATGGAAT
ATAATCCACA TGAAGCAGTG AAGAATAATA TTTTTGGAAC GAAGAATGTG
GCTGAGGCGG CTAAAACTGC AAAGGTTGCC AAATTTGTTA TGGTTTCAAC AGATAAAGCT
GTTAATCCAC CAAATGTCAT GGGAGCGACT AAACGTGTTG CAGAAATGAT
TGTTACAGGT TTAAACGAGC CAGGTCAGAC TCAATTTGCG GCAGTCCGGT TTGGGAATGT
TCTAGGTAGT CGTGGAAGTG TTGTTCCGCT ATTCAAAGAG CAAATTAGAA
AAGGTGGACC TGTTACGGTT ACCGACTTTA GGATGACTCG TTATTTCATG ACGATTCCTG
AGGCAAGTCG TTTGGTTATC CAAGCTGGAC ATTTGGCAAA AGGTGGAGAA
ATATTTGTCT TGGATATGGG CGAGCCAGTA CAAATCCTGG AATTGGCAAG AAAAGTTATC
TTGTTAAGTG GACACACAGA GGAAGAAATC GGGATTGTAG AATCTGGAAT
CAGACCAGGC GAGAAACTCT ACGAGGAATT ATTATCAACA GAAGAACGTG TCAGCGAACA
GATTCATGAA AAAATATTTG TGGGTCGCGT TACAAATAAG CAGTCGGACA
TTGTCAATTC ATTTATCAAT GGATTACTCC AAAAAGATAG AAATGAATTA AAAAATATGT
TGATTGAATT TGCAAAACAA GAATAAGAAA GTAAAAAATA TTTTTACTTT
CCTAGAGTTT AAACGATGTT TAAGTTCTAG GAAGGTTAGA ATACCTAATT AACAACAATA
TTACTATTTA TTAAGAGTCA GATAATAGCA ACTAAGTGCT ACAAACTATC
TTTATAATAA GTATATTTGG TCAAAAGGGA GATGTGAAAT GTATCCAATT TGTAAACGTA
TTTTAGCAAT TATTATCTCA GGGATTGCTA TTGTTGTTCT GAGTCCAATT
TTATTATTGA TTGCATTGGC AATTAAATTA GATTCTAAAG GTCCGGTATT ATTTAAACAA
AAGCGGGTTG GTAAAAACAA GTCATACTTT ATGATTTATA AATTCCGTTC
TATGTACGTT GACGCACCAA GTGATATGCC GACTCATCTA TTAAAGGATC CTAAGGCGAT
GATTACCAAG GTGGGCGCGT TTCTCAGAAA AACAAGTTTA GATGAACTGC
CACAGCTTTT TAATATTTTT AAAGGTGAAA TGGCGATTGT TGGTCCACGC CCAGCCTTAT
GGAATCAATA TGACTTAATT GAAGAGCGAG ATAAATATGG TGCAAATGAT
ATTCGTCCTG GACTAACCGG TTGGGCTCAA ATTAATGGTC GTGATGAATT GGAAATTGAT
GAAAAGTCAA AATTAGATGG ATATTATGTT CAAAATATGA GTCTAGGTTT
GGATATTAAA TGTTTCTTAG GTACATTCCT CAGTGTAGCC AGAAGCGAAG GTGTTGTTGA
AGGTGGAACA GGGCAGAAAG GAAAAGGATG AAATTTTCAG TATTAATGTC
GGTCTATGAG AAAGAAAAAC CAGAGTTTCT TAGGGAATCT TTGGAAAGCA TCCTTGTCAA
TCAAACAATG ATTCCAACGG AGGTTGTCTT GGTAGAGGAT GGGCCACTCA
ATCAGAGCTT ATATAGTATT TTAGAAGAAT TTAAAAGTCG ATTTTCATTT TTTAAAACGA
TAGCCTTGGA AAAGAATTCG GGTTTAGGAA TTGCACTGAA TGAAGGTTTG
AAACATTGTA ATTATGAGTG GGTTTGCACG AAATGGATTC TGATGATGTT GCATATACAT
ACACGTTTTG AAAAGCAAGT TAACTTTATA AAACAAAACC CGACTATAGA
TATTGAGATA GATGAGTTCT TAAATTCTAC TAGTGAAATA GTTTCTCATA AAAATGTTCC
AACCCAGCAC GATGAAATAT TAAAGATGGC AAGGCGGGAG AAATCCATGT 
GCCACATGAC TGTAATGTTT AAAAAGAAAA GTGTCGAGAG AGCAGGGGGG TATCAAACAC 
TTCCGTACGT AGAAGATTAT TTCCTTTGGG TGCGCATGAT TGCTTCAGGA 
TCGAAATTTG CAAACATTGA TGAAACACTA GTTCTTGCAC GTGTTGGAAA TGGGATGTTC 
AATAGGAGGG GGAACAGAGA ACAAATTAAC AGTTGGACAT TACTAATTGA 
ATTTATGTTA GCTCAAGGAA TTGTTACACC ACTAGATGTA TTTATTAATC AAATTTACAT 
TAGGGTCTTT GTTTATATGC CAACTTGGAT AAAGAAACTC ATTTATGGAA 
AAATCTTAAG GAAATAGTAT GATTACAGTA TTGATGGCTA CATATAATGG AAGCCCATTT 
ATAATAAAAC AGTTAGATTC AATTCGAAAT CAAAGTGTAT CAGCAGACAA 
AGTTATTATT TGGGATGATT GCTCGACAGA TGATACAATA AAAATAATAA AAGATTATAT 
AAAAAAATAT TCTTTGGATT CATGGGTTGT CTCTCAAAAT AAATCTAATC 
AGGGGCATTA TCAAACATTT ATAAATTTGA CAAAGTTAGT TCAGGAAGGA ATAGTCTTTT 
TTTCAGATCA AGATGATATT TGGGACTGTC ATAAAATTGA GACAATGCTT 
CCAATCTTTG ACAGAGAAAA TGTATCAATG GTGTTTTGCA AATCCAGATT GATTGATGAA 
AACGGAAATA TTATCAGTAG CCCAGATACT TCGGATAGAA TCAATACGTA 
CTCTCTAGA 

AYRQGVRYIV ATSHRRKGMF ETPEKVIMTN FLQFKDAVAE VYPEIRLCYG AELYYSKDIL
SKLEKKKVPT LNGSRYILLE FSSDTPWKEI QEAVNEVTLL GLTPVLAHIE 
RYDALAFHAE RVEELIDKGC YTQVNSNHVL KPTLIGDRAK EFKKRTRYFL EQDLVHCVAS 
DMHNLSSRPP FMREAYKLLT EEFGKDKAKA LLKKNPLMLL KNQAI 

MDLGTVTDKL LERNSKRLIL VCMDTCLLIV SMILSRLFLD VIIDIPDERF ILAVLFVSIL
YLILSFRLKV FSLITRYTGY QSYVKIGLSL ISAHSLFLII SMVLWQAFSY
RFILVSLFLS YVMLITPRIV WKVLHETRKN AIRKKDSPLR ILVVGAGDGG NIFINTVKDR
KLNFEIVGIV DRDPNKLGTF IRTAKVLGNR NDIPRLVEEL AVDQVTIAIP
SLNGKEREKI VEICNTTGVT VNNMPSIEDI MAGNMSVSAF QEIDVADLLG RPEVVLDQDE
LNQFFQGKTI LVTGAGGSIG SELCRQIAKF TPKRLLLLGH GENSIYLIHR
ELLEKYQGKI ELVPLIADIQ DRELIFSIMA EYQPDVVYHA AAHKHVPLME YNPHEAVKNN
IFGTKNVAEA AKTAKVAKFV MVSTDKAVNP PNVMGATKRV AEMIVTGLNE
PGQTQFAAVR FGNVLGSRGS VVPLFKEQIR KGGPVTVTDF RMTRYFMTIP EASRLVIQAG
HLAKGGEIFV LDMGEPVQIL ELARKVILLS GHTEEEIGIV ESGIRPGEKL
YEELLSTEER VSEQIHEKIF VGRVTNKQSD IVNSFINGLL QKDRNELKNM LIEFAKQE

MYPICKRILA IIISGIAIVV LSPILLLIAL AIKLDSKGPV LFKQKRVGKN KSYFMIYKFR
SMYVDAPSDM PTHLLKDPKA MITKVGAFLR KTSLDELPQL FNIFKGEMAI
VGPRPALWNQ YDLIEERDKY GANDIRPGLT GWAQINGRDE LEIDEKSKLD GYYVQNMSLG
LDIKCFLGTF LSVARSEGVV EGGTGQKGKG

MKFSVLMSVY EKEKPEFLRE SLESILVNQT MIPTEVVLVE DGPLNQSLYS ILEEFKSRFS
FFKTIALEKN SGLGIALNEG LKHCNYEWVC TKWILMMLHI HTRFEKQVNF
IKQNPTIDIE IDEFLNSTSE IVSHKNVPTQ HDEILKMARR EKSMCHMTVM FKKKSVERAG
GYQTLFYVED YFLWVRMIAS GSKFANIDET LVLARVGNGM FNRRGNREQI
NSWTLLIEFM LAQGIVTPLD VFINQIYIRV FVYMPTWIKK LIYGKILRK

MITVLMATYN GSPFIIKQLD SIRNQSVSAD KVIIWDDCST DDTIKIIKDY IKKYSLDSWV
VSQNKSNQGH YQTFINLTKL VQEGIVFFSD QDDIWDCHKI ETMLPIFDRE
NVSMVFCKSR LIDENGNIIS SPDTSDRINT YSL

CTGCAGCACA TAAGCATGTT CCATTGATGG AATATAATCC ACATGAAGCA GTGAAGAATA
ATATTTTTGG AACGAAGAAT GTGGCTGAGG CGGCTAAAAC TGCAAAGGTT 
GCCAAATTTG TTATGGTTTC AACAGATAAA GCTGTTAATC CGCCAAATGT CATGGGAGCG 
ACTAAACGTG TTGCAGAAAT GATTGTAACA GGTTTAAACG AGCCAGGTCA 
GACTCAATTT GCGGCAGTCC GTTTTGGGAA TGTTCTAGGT AGTCGTGGAA GTGTTGTTCC 
GCTATTCAAA GAGCAAATTA GAAAAGGTGG ACCTGTTACG GTTACCGACT 
TTAGGATGAC TCGTTATTTC ATGACGATTC CTGAGGCAAG TCGTTTGGTT ATCCAAGCTG 
GACATTTGGC AAAAGGTGGA GAAATCTTTG TCTTGGATAT GGGTGAGCCA 
GTACAAATCC TGGAATTGGC AAGAAAAGTT ATCTTGTTAA GCGGACATAC AGAGGAAGAA 
ATCGGGATTG TAGAATCTGG AATCAGACCA GGCGAGAAAC TCTACGAGGA 
ATTGTTATCA ACAGAAGAAC GTGTCAGCGA ACAGATTCAT GAAAAAATAT TTGTGGGTCG 
CGTTACAAAT AAGCAGTCGG ACATTGTCAA TTCATTTATC AATGGATTAC 
TCCAAAAAGA TAGAAATGAA TTAAAAGATA TGTTGATTGA ATTTGCAAAA CAAGAATAAG 
AAAGTAAAAA ATATTTTTAC TTTCCTAGAG TTTAAACGAT GTTTAAGTTC 
TAGGAAGGTT GGAATTGCTT TCGTGGAGGT GATAGATAGA AACCTATATA TTTGTAGAAG 
AAAGGATATT AAACTAAAGG TGAATCGGAA CATAAAGTTT AGATAGAGTT 
GGTATTTAAT GCCAAACAGG TGAATGCAAC CTCTCGCTCG TTACTAAGCA GGAGATAGTA 
AAGTTGCTTG AAAGAGAGTT TGTTAATCAG TATAAGTAGG CTAAAGTGAG 
AATATATATC TATTATTATC GGTAATGATA CTATTATTGA GAATTATTGT AGTGGGGATA 
AAAATAATTT TTGGTGATTT TATCGTCCGA CTTAAAGGTG GGTTAAAAAA 
GTACTTATAT TCTTTTAGAA TTGATGAAAA ATATGGGGGA ATATAATATT TATAGGAGAT 
ACGATGACTA GAGTAGAGTT GATTACTAGA GAATTTTTTA AGAAGAATGA 
AGCAACCAGT AAATATTTTC AGAAGATAGA ATCAAGAAGA GGTGAATTAT TTATTAAATT 
CTTTATGGAT AAGTTACTTG CGCTTATCCT ATTATTGCTA TTATCCCCAG 
TAATCATTAT ATTAGCTATT TGGATAAAAT TAGATAGTAA GGGGCCAATT TTTTATCGCC 
AAGAACGTGT TACGAGATAT GGTCGAATTT TTAGAATATT TAAGTTTAGA 
ACAATGATTT CTGATGCGGA TAAAGTCGGA AGTCTTGTCA CAGTCGGTCA AGATAATCGT 
ATTACGAAAG TCGGTCACAT TATCAGAAAA TATCGGCTGG ACGAAGTGCC 
CCAACTTTTT AATGTTTTAA TGGGGGATAT GAGCTTTGTA GGTGTAAGAC CAGAAGTACA 
AAAATATGTA AATCAGTATA CTGATGAAAT GTTTGCGACG TTACTTTTAC 
CTGCAGGAAT TACTTCACCA GCGAGTATTG CATATAAGGA TGAAGATATT GTTTTAGAAG 
AATATTGTTC TCAAGGCTAT AGTCCTGATG AAGCATATGT TCAAAAAGTA 
TTACCAGAAA AAATGAAGTA CAATTTGGAA TATATCAGAA ACTTTGGAAT TATTTCTGAT 
TTTAAAGTAA TGATTGATAC AGTAATTAAA GTAATAAAAT AGGAGATTAA 
AATGACAAAA AGACAAAATA TTCCATTTTC ACCACCAGAT ATTACCCAAG CTGAAATTGA 
TGAAGTTATT GACACACTAA AATCTGGTTG GATTACAACA GGACCAAAGA 
CAAAAGAGCT AGAACGTCGG CTATCAGTAT TTACAGGAAC CAATAAAACT GTGTGTTTAA 
ATTCTGCTAC TGCAGGATTG GAACTAGTCT TACGAATTCT TGGTGTTGGA 
CCCGGAGATG AAGTTATTGT TCCTGCTATG ACCTATACTG CCTCATGTAG TGTCATTACT 
CATGTAGGAG CAACTCCTGT GATGGTTGAT ATTCAAAAAA ACAGCTTTGA 
GATGGAATAT GATGCTTTGG AAAAAGCGAT TACTCCGAAA ACAAAAGTTA TCATTCCTGT 
TGATCTAGCT GGTATTCCTT GTGATTATGA TAAGATTTAT ACCATCGTAG 
AAAACAAACG CTCTTTGTAT GTTGCTTCTG ATAATAAATG GCAGAAACTT TTTGGGCGAG 
TTATTATCCT ATCTGATAGT GCACACTCAC TAGGTGCTAG TTATAAGGGA 
AAACCAGCGG GTTCCCTAGC AGATTTTACC TCATTTTCTT TCCATGCAGT TAAGAATTTT 
ACAACTGCTG AAGGAGGTAG TGTGACATGG AGATCACATC CTGATTTGGA 
TGACGAAGAG ATGTATAAAG AGTTTCAGAT TTACTCTCTT CATGGTCAGA CAAAGGATGC 
ATTAGCTAAG ACACAATTAG GGTCATGGGA ATATGACATT GTTATTCCTG 
GTTACAAGTG TAATATGACA GATATTATGG CAGGTATCGG TCTTGTGCAA TTAGAACGTT 
ACCCATCTTT GTTGAATCGT CGCAGAGAAA TCATTGAGAA ATACAATGCT 
GGCTTTGAGG GGACTTCGAT TAAGCCGTTG GTACACCTGA CGGAAGATAA ACAATCGTCT 
ATGCACTTGT ATATCACGCA TCTACAAGGC TATACTTTAG AACAACGAAA 
TGAAGTCATT CAAAAAATGG CTGAAGCAGG TATTGCGTGC AATGTTCACT ACAAACCATT 
ACCTCTTCTC ACAGCCTACA AGAATCTTGG TTTTGAAATG AAAGATTTTC 
CGAATGCCTA TCAGTATTTT GAAAATGAAG TTACACTGCC TCTTCATACC AACTTGAGTG 
ATGAAGATGT GGAGTATGTG ATAGAAATGT TTTTAAAAAT TGTTAGTAGA 
GATTAGTTAT TTTGGAAGGA GATATGGTGG AAAGAGATAT GGTGGAAAGA GACACGTTGG 
TATCTATAAT AATGCCCTCG TGGAATACAG CTAAGTATAT ATCTGAATCA 
ATCCAGTCAG TGTTGGACCA AACACACCAA AATTGGGAAC TTATAATCGT TGATGATTGT 
TCTAATGACG AAACTGAAAA AGTTGTTTCG CATTTCAAAG ATTCAAGAAT 
AAAGTTTTTT AAAAATTCGA ATAATTTAGG GGCAGCTCTA ACACGAAATA AGGCACTAAG
AAAAGCTAGA GGTAGGTGGA TTGCGTTCTT GGATTCAGAT GATTTATGGC 
ACCCGAGTAA GCTAGAAAAA CAGCTTGAAT TTATGAAAAA TAATGGATAT TCATTTACTT 
ATCACAATTT TGAAAAGATT GATGAATCTA GTCAGTCTTT ACGTGTCCTG 
GTGTCAGGAC CAGCAATTGT GACTAGAAAA ATGATGTACA ATTACGGCTA TCCAGGGTGT 
TTGACTTTCA TGTATGATGC AGACAAAATG GGTTTAATTC AGATAAAAGA
TATAAAGAAA AATAACGATT ATGCGATATT ACTTCAATTG TGTAAGAAGT ATGACTGTTA 
TCTTTTAAAT GAAAGTTTAG CTTCGTATCG AATTAGAAAA AA 

AAHKHVPLME YNPHEAVKNN IFGTKNVAEA AKTAKVAKFV MVSTDKAVNP PNVMGATKRV
AEMIVTGLNE PGQTQFAAVR FGNVLGSRGS VVPLFKEQIR KGGPVTVTDF
RMTRYFMTIP EASRLVIQAG HLAKGGEIFV LDMGEPVQIL ELARKVILLS GHTEEEIGIV
ESGIRPGEKL YEELLSTEER VSEQIHEKIF VGRVTNKQSD IVNSFINGLL
QKDRNELKDM LIEFAKQE

MTRVELITRE FFKKNEATSK YFQKIESRRG ELFIKFFMDK LLALILLLLL SPVIIILAIW
IKLDSKGPIF YRQERVTRYG RIFRIFKFRT MISDADKVGS LVTVGQDNRI 
TKVGHIIRKY RLDEVPQLFN VLMGDMSFVG VRPEVQKYVN QYTDEMFATL LLPAGITSPA 
SIAYKDEDIV LEEYCSQGYS PDEAYVQKVL PEKMKYNLEY IRNFGIISDF 
KVMIDTVIKV IK 

MTKRQNIPFS PPDITQAEID EVIDTLKSGW ITTGPKTKEL ERRLSVFTGT NKTVCLNSAT
AGLELVLRIL GVGPGDEVIV PAMTYTASCS VITHVGATPV MVDIQKNSFE 
MEYDALEKAI TPKTKVIIPV DLAGIPCDYD KIYTIVENKR SLYVASDNKW QKLFGRVIIL 
SDSAHSLGAS YKGKPAGSLA DFTSFSFHAV KNFTTAEGGS VTWRSHPDLD 
DEEMYKEFQI YSLHGQTKDA LAKTQLGSWE YDIVIPGYKC NMTDIMAGIG LVQLERYPSL 
LNRRREIIEK YNAGFEGTSI KPLVHLTEDK QSSMHLYITH LQGYTLEQRN 
EVIQKMAEAG IACNVHYKPL PLLTAYKNLG FEMKDFPNAY QYFENEVTLP LHTNLSDEDV 
EYVIEMFLKI VSRD 



Fig. 6 cont. 



CPS7G 



WO 00/05378 PCT/NL99/00460

MVERDMVERD TLVSIIMPSW NTAKYISESI QSVLDQTHQN WELIIVDDCS NDETEKVVSH
FKDSRIKFFK NSNNLGAALT RNKALRKARG RWIAFLDSDD LWHPSKLEKQ
LEFMKNNGYS FTYHNFEKID ESSQSLRVLV SGPAIVTRKM MYNYGYPGCL TFMYDADKMG
LIQIKDIKKN NDYAILLQLC KKYDCYLLNE SLASYRIRK